Set up a BYO Kubernetes host

Run agent sandboxes as pods in a Kubernetes cluster you own. Install a small operator, and every sandbox launches inside your boundary, under your network policy and your audit, while the platform still supplies the image, the runtime, and the policy. The platform never touches your cluster API; the operator reaches out.

A Bring-Your-Own Kubernetes host runs agent sandboxes as pods in a cluster your organization owns: a managed AKS, EKS, or GKE cluster, or an on-premises one of your own. You install a small operator into the cluster, and from then on every sandbox from a profile pointed at this host launches there, scheduled and torn down inside your boundary. This is the option for the team that needs agent execution to live on infrastructure it governs. This page is the how-to.

The platform never calls your cluster's API. The operator you install reaches out to the platform, never the other way around, so there is no inbound port to open and no endpoint of yours exposed to the internet. For the deployment choice this sits inside, read sandbox hosts and deployment options; for the connectivity model in depth, read network and connectivity. This page assumes you have decided on Kubernetes.

Before you start

A few things need to be in place:

A Kubernetes cluster you control, and kubectl access to it with permission to create a namespace, service accounts, RBAC, a Secret, a ConfigMap, and a deployment. The operator and the sandbox pods live in their own namespace, so they do not mix with the rest of your workloads.
Permission to manage hosts. Registering a host rides on the hosts.manage scope, and bringing your own host is gated by an entitlement on your workspace's plan. If you cannot see the Register host button, that is what to check with an admin.
Your workspace set up for bring-your-own hosts. The operator connects over a managed messaging channel that an admin enables for your workspace. When it is not configured, the operator bundle download is held back rather than handing you a bundle that cannot connect.

Register the host

Go to Platform → Sandbox Hosts and choose Register host. Pick the Kubernetes operator kind, give it a name you will recognize (prod-eks or eu-cluster, say), and set the runtime families it should serve, meaning the language runtimes your profiles need. The name you choose becomes the host's identifier and is fixed once you save. Then save.

The host appears in the list as Awaiting operator: the record exists, but nothing is running against it yet. The next step is to install the operator that claims it.

Screenshot to capture

The Register host page (a full page), dark theme, breadcrumb 'Platform / Sandbox Hosts / Register host'. Fields: 'Display name' input 'prod-eks' with a hint 'Host id will be prod-eks and cannot change'; an optional 'Description' textarea; 'Operator kind' a dropdown set to 'Kubernetes operator (v1)' (the list also shows 'Local Docker operator' and two greyed reserved entries 'Azure Container Apps operator (reserved)' and 'Custom / non-strategic'); 'Runtime families' a comma-separated text input 'node, python'; an optional 'Registration endpoint' input; a 'Tags' comma-separated input 'prod, eu'; and an 'Identity capabilities' section with an unchecked 'Azure Workload Identity' checkbox. A primary 'Register' button and 'Cancel'.

save as: public/docs-media/k8s-host-register.png

Caption when added: Registering a Kubernetes host creates the record. The display name becomes its fixed id, and the host shows Awaiting operator until the operator you install claims it.

Install the operator

A host record on its own does nothing; the operator is the small deployment that actually launches pods. From the host's detail page, choose Download operator bundle. For a Kubernetes host the bundle is a Kubernetes manifest, a single YAML file you apply with kubectl. The host page shows the two commands to run: a kubectl apply -f to install it, then a kubectl rollout status to watch it come up.

The Kubernetes install: register the host, download the manifest, apply it with kubectl, and the operator dials out to claim the host. The connection runs outward; nothing inbound is opened.

Applying the manifest installs a tidy, self-contained footprint in one namespace:

a namespace of its own, so nothing mixes with your existing workloads,
a service account for the operator and a separate one for the sandbox pods,
RBAC scoped to that namespace and nothing wider, granting the operator only what it needs to manage pods and service accounts there,
the operator deployment itself, a single small replica.

Once it is running, the operator dials out to the platform, claims the host, and begins checking in. Watch the host's status move from Awaiting operator to Registered. You never opened a port; the operator reached out and connected itself.
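The two install commands can be wrapped in one small function, as a sketch. The namespace, deployment, and manifest names follow the example shown on the host page (disco-parrot, disco-parrot-operator, prod-eks-operator.yaml); treat them as placeholders and use whatever your own host page shows. The 120-second timeout is an arbitrary choice for illustration.

```shell
# Placeholder names taken from the host page example; substitute your own.
NAMESPACE="disco-parrot"
DEPLOYMENT="disco-parrot-operator"

# Apply the downloaded manifest, then wait for the operator to roll out.
# The rollout check only runs if the apply succeeded.
install_operator() {
  manifest="$1"
  kubectl apply -f "$manifest" &&
    kubectl -n "$NAMESPACE" rollout status "deployment/$DEPLOYMENT" --timeout=120s
}

# Usage:
#   install_operator ./prod-eks-operator.yaml
```

Once the rollout reports success, the host's status on the Sandbox Hosts list is the place to confirm that the operator has claimed it.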

Screenshot to capture

The operator-install section of a Kubernetes host detail page, dark theme. A heading 'Operator bundle' with a 'Download bundle' button and a note 'Downloading issues a fresh key and retires the previous one.' Below it an 'Apply' shell code block with three lines: 'cd ~/Downloads', 'kubectl apply -f ./prod-eks-operator.yaml', and 'kubectl -n disco-parrot rollout status deployment/disco-parrot-operator'. Beside it a collapsed preview of the manifest showing Kubernetes kinds 'Namespace', 'ServiceAccount', 'Role', 'RoleBinding', 'Secret', 'ConfigMap', 'Deployment'. A small status line 'Credential issued 2 minutes ago'. Breadcrumb 'Platform / Sandbox Hosts / prod-eks'.

save as: public/docs-media/k8s-operator-bundle.png

Caption when added: The operator bundle for a Kubernetes host is a single manifest: a namespace, two service accounts, namespace-scoped RBAC, and one small operator deployment. Apply it, watch it roll out, and the operator connects itself.

The key inside the bundle is tied to that download: each download issues a fresh key and retires the previous one. If you re-download, re-apply the new manifest; otherwise the running operator is left holding a key the platform no longer accepts. The simplest path is to download once and apply that manifest, and the key is settled.

What runs in your cluster

The footprint is deliberately small and easy to reason about, which matters when a security review asks what you just installed. The operator is one deployment in one namespace, with RBAC that reaches no further than that namespace and covers only pods and service accounts. It opens no ports and exposes no service. When a profile pointed at this host launches a sandbox, the operator creates a pod in the namespace, sized to the profile's resource class, and removes it when the work is done.

What the operator installs: one namespace holding a single operator deployment, two service accounts, namespace-scoped RBAC, and the sandbox pods it creates on demand. No service, no ingress, no listening port.

For a reviewer who wants the precise grant rather than the summary, here it is. The operator's RBAC, scoped to the one namespace, lets it create, read, watch, and delete pods, and get and create service accounts, and nothing else: no secrets, no nodes, nothing cluster-wide. The operator deployment is small, requesting a tenth of a CPU and 128Mi of memory and capped at half a CPU and 256Mi. Its install key sits in a Kubernetes Secret in the namespace while the non-sensitive settings sit in a ConfigMap, and the messaging-channel address is not in the bundle at all. The operator fetches that at runtime when it exchanges its key, so there is no standing connection secret living in your cluster.
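A reviewer can check the grant against the live cluster rather than the manifest. The sketch below asks the API server, via kubectl auth can-i with impersonation, what the operator's service account may do. The namespace comes from the host page example; the service account name is a guess for illustration, so read the real one from the manifest. Running it requires permission to impersonate service accounts.

```shell
NAMESPACE="disco-parrot"
# Hypothetical service account name; check the ServiceAccount in your manifest.
OPERATOR_SA="disco-parrot-operator"

# Ask whether the operator's service account may perform VERB on RESOURCE.
# Prints "yes" or "no". Extra arguments (such as -n) pass through to kubectl.
operator_can() {
  verb="$1"; resource="$2"; shift 2
  kubectl auth can-i "$verb" "$resource" \
    --as="system:serviceaccount:$NAMESPACE:$OPERATOR_SA" "$@"
}

# Print one line per check. Per the grant described above, the pod and
# service-account checks should say yes and the last three should say no.
check_operator_rbac() {
  for check in "create pods" "get pods" "watch pods" "delete pods" \
               "get serviceaccounts" "create serviceaccounts" \
               "get secrets" "get nodes"; do
    # $check is left unquoted on purpose so it splits into verb and resource.
    printf '%-24s %s\n' "$check" "$(operator_can $check -n "$NAMESPACE")"
  done
  printf '%-24s %s\n' "list pods (all ns)" "$(operator_can list pods --all-namespaces)"
}

# Usage:
#   check_operator_rbac
```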

Because the pods are scheduled by your cluster, they inherit your cluster's posture: your node pools, your network policy, your admission controllers, your audit. The agent run that would have happened on managed compute now happens under your controls. Nothing about how the sandbox is built changes. The image, the runtime, the tools, and the change policy are the same as on the managed host; only the place it runs does.

Each sandbox pod is ephemeral by design and hardened the way your own workloads should be. It runs as a non-root user, drops every Linux capability, cannot escalate privileges, and carries a service account with no access to the Kubernetes API at all, so a sandbox does its work without being able to reach into your cluster. Its workspace is scratch space that exists only for the life of the pod. A run gets a clean pod, does its work, and the pod is torn down, so nothing carries over from one run to the next and nothing is left behind on your nodes. A sandbox counts as ready only once its pod is running and its sidecar answers a health check; if it cannot, the pod is torn down and the launch fails fast rather than handing you a half-up sandbox.
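The hardening can be read off a running sandbox pod. This is a sketch that prints the pod's service account and its security contexts with a kubectl jsonpath query; the namespace is the one from the host page example and the pod name is whatever kubectl get pods lists. In standard Kubernetes terms the settings described above would normally appear as runAsNonRoot, allowPrivilegeEscalation: false, and capabilities.drop containing ALL, though the exact fields the operator sets are an assumption here.

```shell
NAMESPACE="disco-parrot"

# Print the service account, the pod-level security context, and each
# container's security context for one sandbox pod.
show_pod_hardening() {
  pod="$1"
  kubectl -n "$NAMESPACE" get pod "$pod" -o jsonpath='{"serviceAccount: "}{.spec.serviceAccountName}{"\n"}{"pod: "}{.spec.securityContext}{"\n"}{range .spec.containers[*]}{.name}{": "}{.securityContext}{"\n"}{end}'
}

# Usage (the pod name is a placeholder):
#   show_pod_hardening sandbox-123
```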

Let pods authenticate as a workload identity

When a sandbox on your cluster needs to reach a cloud resource (a database, a storage account, a managed service), the right way to let it is a workload identity, not a secret baked into the environment. On a Kubernetes host, Disco Parrot uses the standard Azure Workload Identity model: the operator gives each sandbox pod a service account annotated with your managed identity, and your cluster's identity webhook projects a short-lived token into the pod. The pod authenticates to Azure as that identity, the token lasts minutes, and no long-lived credential ever sits in the sandbox.

This is the one part of the setup that lives in your cloud rather than in Disco Parrot, because it is your identity to grant. There are three pieces to put in place, once per cluster:

The Azure Workload Identity webhook installed in the cluster (on AKS this is the workload-identity add-on; on other clusters it is the open-source webhook).
Your cluster's OIDC issuer registered with Azure, so Azure will trust tokens the cluster projects.
A managed identity with a federated credential that trusts your cluster's issuer and the sandbox service account, plus whatever cloud roles you want that identity to hold.
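On AKS, the first two pieces can be switched on with one Azure CLI call. A minimal sketch, assuming an existing AKS cluster and a logged-in az session; the resource group and cluster names are placeholders. Other clusters need the open-source webhook and their own issuer setup, which this does not cover.

```shell
# Enable the OIDC issuer and the workload-identity add-on on an existing
# AKS cluster. Arguments: resource group, cluster name.
enable_aks_workload_identity() {
  resource_group="$1"; cluster="$2"
  az aks update --resource-group "$resource_group" --name "$cluster" \
    --enable-oidc-issuer --enable-workload-identity
}

# Usage (placeholder names):
#   enable_aks_workload_identity my-rg prod-aks
```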

With those in place, you turn on Azure Workload Identity on the host, and each profile that should use it names the managed identity (its client id) and tenant. From then on, a sandbox launched from that profile comes up already able to authenticate as that identity. The full credential and identity model, and how this fits the rest of an agent's reach, is covered in approved actions; for the governance view (what you grant, what you can revoke, and what a reviewer can verify), see identity and cloud access.

A few concrete values tie this together, and one of them is where teams usually trip. The federated credential you create in Azure has three parts to get right:

subject, the service account the operator runs the identity's pods under (it manages a dedicated service account for each managed identity you wire up),
audience, the standard api://AzureADTokenExchange,
issuer, your cluster's own OIDC issuer URL (on AKS, read it with az aks show).

Bind the federated credential to that issuer and that service account, and it trusts exactly the pods this host launches and nothing else.
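The three values can be assembled with the Azure CLI, sketched below for AKS. Kubernetes formats a service account's token subject as system:serviceaccount:<namespace>:<name>. The sketch assumes the cluster and the managed identity share a resource group, and the credential name suffix, the identity name, and the service account name are all placeholders; use the service account the operator actually manages for that identity.

```shell
AUDIENCE="api://AzureADTokenExchange"

# Build the subject claim for a service account token.
federated_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Read an AKS cluster's OIDC issuer URL. Arguments: resource group, cluster.
aks_issuer_url() {
  az aks show --resource-group "$1" --name "$2" \
    --query "oidcIssuerProfile.issuerUrl" --output tsv
}

# Create the federated credential on a managed identity.
# Arguments: resource group, cluster, identity name, namespace, service account.
create_federated_credential() {
  resource_group="$1"; cluster="$2"; identity="$3"
  namespace="$4"; service_account="$5"
  az identity federated-credential create \
    --name "${identity}-sandbox" \
    --identity-name "$identity" \
    --resource-group "$resource_group" \
    --issuer "$(aks_issuer_url "$resource_group" "$cluster")" \
    --subject "$(federated_subject "$namespace" "$service_account")" \
    --audiences "$AUDIENCE"
}

# Usage (all names are placeholders):
#   create_federated_credential my-rg prod-aks insights-reader disco-parrot sandbox-insights
```

A mismatch in any one of the three values makes the token exchange fail, so it is worth echoing the issuer and subject before creating the credential.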

How a pod authenticates to a cloud resource without a stored secret. Your cluster signs a token, a federated credential in your cloud trusts that issuer and the sandbox account, and the identity webhook projects a short-lived token into the pod. The first links live in your cloud; the rest in your cluster.

Screenshot to capture

The 'Identity capabilities' section of a Kubernetes host detail page, dark theme. A 'Cloud identity' panel with a checked 'Azure Workload Identity' checkbox and an info note beside it reading 'Enable only when the underlying cluster is configured for Azure Workload Identity (AKS with the AWI webhook and a federated identity bound to the sandbox service account). When enabled, sandboxes on this host receive an Azure AD token via the projected service account, with no client secrets in the sandbox container.' Breadcrumb 'Platform / Sandbox Hosts / prod-eks'.

save as: public/docs-media/k8s-workload-identity.png

Caption when added: The host's identity toggle, with the cluster prerequisites spelled out in the note beside it. Workload identity is set up once per cluster, in your own cloud; the host carries the switch and each profile names the identity it should run as.

What the operator reaches

The operator is an outbound client, and a security team will want the short list of where it connects. From inside your cluster it reaches exactly three destinations:

the platform endpoint where it exchanges its key for a connection token,
the managed messaging channel (Azure Web PubSub) it holds its connection open over,
the platform's container registry, where the sandbox image lives.

That is the whole egress story for the operator. Sandboxes themselves reach only what their work needs and their leased credentials allow: your Git provider, a package registry, the cloud resource a workload identity grants. There is nothing inbound to allow, ever. The full picture (the channel, the tokens, and how a host recovers if its connection drops) is covered in network and connectivity.

Air-gapped and restricted-egress clusters

Many of the clusters this matters most for cannot reach the public internet freely, and the model is friendly to that. The operator's whole world is the three outbound destinations above, so an egress allowlist or proxy that permits them is all it needs. One nuance worth planning for: the messaging channel's address is handed to the operator at key-exchange time rather than baked into the bundle, so a proxy has to permit the Web PubSub host the operator is given, not a fixed one you could write down in advance.

The destination most likely to bite in a locked-down cluster is the image pull. If your nodes pull through an internal mirror or a proxy, make sure the sandbox image is reachable that way, because a blocked pull keeps the pod's containers from ever starting, which surfaces as the launch failing fast rather than a sandbox half-coming-up. And for the strictest clusters there is a piece of good news: workload-identity token projection happens inside the cluster, through the webhook, so turning on cloud identity adds nothing to your egress at all.
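When a launch fails fast in a restricted cluster, the namespace's recent warning events usually show whether an image pull was the cause. A small sketch, assuming the namespace from the host page example; the text match on "pull" is a convenience filter, not an exhaustive one.

```shell
NAMESPACE="disco-parrot"

# List recent warning events in the namespace that mention an image pull,
# oldest first. Returns non-zero when no such event is found.
recent_pull_failures() {
  kubectl -n "$NAMESPACE" get events --field-selector type=Warning \
    --sort-by=.lastTimestamp | grep -i 'pull'
}

# Usage:
#   recent_pull_failures || echo "no pull-related warnings"
```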

Keeping it healthy

A Kubernetes host is healthy when its operator is connected and able to do its job, and the Sandbox Hosts list is where you read that at a glance. A registered host with a recent last-seen is working; one whose connection has dropped shows it, alongside the operator version it last reported. Behind that status, the platform confirms the things that have to be true for the operator to launch pods: that it can reach the cluster API, that it owns its namespace, that its RBAC is sufficient, and that it is connected to the messaging channel. A host that cannot do one of those does not quietly look fine. The connection itself (the last-seen and the operator version) you read off the list; the cluster-side facts (the namespace, the RBAC, the API reach) you can confirm directly against the cluster with kubectl whenever you want the ground truth.
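For the cluster-side ground truth, two quick checks cover most of it: the namespace exists and the operator deployment has a ready replica. A minimal sketch using the names from the host page example; substitute your own.

```shell
NAMESPACE="disco-parrot"
DEPLOYMENT="disco-parrot-operator"

# Succeeds when the operator's namespace exists.
namespace_exists() {
  kubectl get namespace "$NAMESPACE" >/dev/null 2>&1
}

# Succeeds when the operator deployment reports at least one ready replica.
# Kubernetes omits readyReplicas when it is zero, so empty counts as 0.
operator_ready() {
  ready=$(kubectl -n "$NAMESPACE" get deployment "$DEPLOYMENT" \
    -o jsonpath='{.status.readyReplicas}')
  [ "${ready:-0}" -ge 1 ]
}

# Usage:
#   namespace_exists && operator_ready && echo "operator is up"
```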

Screenshot to capture

The Sandbox Hosts list under Platform, dark theme, focused on a Kubernetes host row. The grid shows columns Name, Host ID, Kind / target, Runtime families, Identity, Scope, Status, Bundle, Last seen, Operator version. The 'prod-eks' row reads: Kind 'Kubernetes operator', Runtime families 'node, python', Identity 'Azure WI', Scope 'Current', a green 'Registered' badge, Bundle 'Download', 'Last seen 5s ago', 'v1.4.2'. Breadcrumb 'Platform / Sandbox Hosts'.

save as: public/docs-media/k8s-host-status.png

Caption when added: A Kubernetes host's live status, last-seen, and operator version read off the Sandbox Hosts list: a registered host checking in regularly is connected and working.

If the operator goes down (redeployed, drained, a node lost), the sandbox pods it created keep running on your cluster, because they are pods on your compute, not processes on ours. The connection shows as dropped until the operator is back, and when it reconnects it reconciles with the pods that survived, re-pointing each running sandbox at a fresh token rather than restarting it. A blip does not throw away in-flight work. The full recovery behavior is covered in network and connectivity.

Updating and retiring the operator

Keeping the operator current is an ordinary Kubernetes update: re-apply a newer bundle, or roll the deployment to a new image, the way you update anything else in the cluster. Your install key keeps working through a version bump, because it only changes when you deliberately download a fresh bundle, so a routine update is not a reconnection chore.
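One cautious way to run a routine update is to preview the change before applying it. The sketch below uses kubectl diff, which exits 0 when nothing would change, 1 when there are differences, and a higher code on error. Names follow the host page example, and the manifest path is a placeholder.

```shell
NAMESPACE="disco-parrot"
DEPLOYMENT="disco-parrot-operator"

# Show what a manifest would change, then apply it and wait for the rollout.
update_operator() {
  manifest="$1"
  if kubectl diff -f "$manifest"; then
    echo "no changes to apply"
    return 0
  else
    status=$?
  fi
  # Status 1 means differences were found; anything higher is a real error.
  if [ "$status" -gt 1 ]; then
    echo "kubectl diff failed with status $status" >&2
    return "$status"
  fi
  kubectl apply -f "$manifest" &&
    kubectl -n "$NAMESPACE" rollout status "deployment/$DEPLOYMENT" --timeout=120s
}

# Usage:
#   update_operator ./prod-eks-operator.yaml
```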

When you are done with a host, retiring it is a deliberate step. Deleting a host from the Sandbox Hosts list stops new sandbox pods from scheduling and releases any capacity it was holding, and it names the profiles that still point at it so you can move them first. Pods already running in your cluster are not pulled out from under their work; they finish on their own, because they are your compute. Removing the operator itself is a kubectl delete of the manifest you applied, on your schedule.
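One thing to plan around: the manifest includes the namespace, and deleting a namespace in Kubernetes removes everything inside it, running sandbox pods included. The sketch below refuses to delete while pods other than the operator's are still present. It assumes operator pods are named with the deployment name as a prefix (the usual Deployment naming) and that sandbox pods are not; verify that against your cluster before relying on it.

```shell
NAMESPACE="disco-parrot"
DEPLOYMENT="disco-parrot-operator"

# Delete the applied manifest only when no pods besides the operator remain.
remove_operator() {
  manifest="$1"
  # Count pods whose name does not start with the operator deployment's name.
  others=$(kubectl -n "$NAMESPACE" get pods --no-headers |
    grep -vc "^$DEPLOYMENT-" || true)
  if [ "$others" -gt 0 ]; then
    echo "$others pod(s) besides the operator still present; not deleting yet" >&2
    return 1
  fi
  kubectl delete -f "$manifest"
}

# Usage:
#   remove_operator ./prod-eks-operator.yaml
```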

Screenshot to capture

A confirmation dialog over the Sandbox Hosts list, dark theme. Title 'Retire this host?'. Body text reads 'Deleting prod-eks stops scheduling new sandbox pods and releases any capacity it was holding. Pods already running in your cluster keep running until they finish. This cannot be undone.' A warning row below lists '2 profiles still target this host and will not launch until you move them' with two profile names 'Node 22 . Insights' and 'prod . eu' shown as chips. A destructive 'Retire host' primary button and a 'Cancel' secondary button. Behind it, the dimmed Sandbox Hosts list. Breadcrumb 'Platform / Sandbox Hosts'.

save as: public/docs-media/k8s-retire-host.png

Caption when added: Retiring a Kubernetes host stops new pods from scheduling and names the profiles still pointed at it, while pods already running in your cluster finish on their own, because they are your compute, not ours.

Standing up a cluster host, end to end

Marcus owns the Insights service, and a security review has drawn a clear line: agent work that touches production-adjacent systems has to run inside the company's own cluster, not on outside compute. He decides to give the team a Kubernetes host.

He registers one at Platform → Sandbox Hosts, names it prod-eks, sets the Node and Python families, and it shows Awaiting operator. He downloads the bundle, a single manifest, hands it to the platform engineer who runs the cluster, and a kubectl apply later the operator rolls out in its own namespace. Within seconds the host flips to Registered. The security reviewer asks the question they always ask, what did we have to open, and the answer is nothing: the operator dialed out, there is no inbound port, and its RBAC reaches no further than one namespace.

Then Marcus sets up the part that lives in their cloud. The cluster already runs the workload-identity webhook, so the platform engineer registers the cluster's issuer with Azure and creates a federated credential for a managed identity that can read the production data store. Marcus turns on Azure Workload Identity on the host and points the relevant profile at that identity. Now an agent working an Insights plan against this host comes up as a pod in their cluster, authenticates to the data store as a managed identity with no stored secret, and is torn down when it finishes. The work runs where the review said it had to, and Marcus can show exactly how, from the manifest, the host's health, and the audit trail.

Who can set one up

Registering and managing a Kubernetes host rides on hosts.manage, and bringing your own host is gated by an entitlement on your plan, so standing one up is an action for the people your organization trusts with infrastructure. Everyone with hosts.read can see the hosts that exist and their status. The cluster-side pieces, applying the manifest and setting up workload identity, are done by whoever administers the cluster and your cloud, which is usually the same platform or security team.

To send work to this host once it is running, point a sandbox profile at it on the profile's General tab. The host serves the runtime families it advertises, so a profile lands on a host equipped for the languages it needs.

Screenshot to capture

The General tab of a sandbox profile, dark theme, focused on host selection. A 'Where it runs' panel with a 'Sandbox host' dropdown set to 'prod-eks (Kubernetes operator)'; the open list also shows 'Managed (default)' and 'local-dev (Local Docker)'. A read-only note beside it reads 'This host serves runtime families: node, python'. Below, a 'Cloud identity' row showing 'Azure managed identity' with a client-id field 'a1b2c3d4-...' and a 'Tenant' field, shown because the host has Azure Workload Identity enabled. A primary 'Save profile' button. Breadcrumb 'Platform / Sandbox Profiles / Node 22 . Insights'.

save as: public/docs-media/k8s-profile-target-host.png

Caption when added: Sending work to the cluster host is one field on a profile: pick it as the sandbox host on the General tab, and name the managed identity its pods should run as. The host serves only the runtime families it advertises.

Why BYO Kubernetes works this way

The thing a security team cannot accept is a black box running code in a place they cannot see or govern. So the model is built to be the opposite of that. The operator is a small, legible footprint (one namespace, one deployment, RBAC that reaches nothing else), and you install it yourself with a manifest you can read line by line. It opens no inbound port, so running it does not widen your attack surface. The sandboxes are pods on your cluster, under your network policy and your admission controllers, so the controls you already trust apply to agent work unchanged. The identity a pod authenticates with is yours to grant and revoke, set up in your own cloud, and it is a short-lived token rather than a secret. And the connection runs outward, from your cluster to the platform.

The result is that "run the agent inside our boundary" stops being a slogan and becomes something you can verify: a known footprint, a known egress, a known identity, and no door held open. You get the platform's environment on your own infrastructure, governed the way you already govern everything else that runs there.

For a platform engineer, this is a manifest you apply and a deployment you can watch, not a mystery. One namespace, scoped RBAC, a single operator pod, and pods that come and go like any other workload. You run it the way you run everything else in the cluster.

For a security owner, the answer to "where does the agent's code run and what did we expose" is concrete: it runs as a pod in a namespace you control, the operator reaches out so nothing reaches in, the egress is three destinations you can allowlist, and the identity it uses is one you granted and can pull.

For a lead, the cluster host is how you satisfy a hard requirement without slowing the team down. The profiles that have to stay in your cluster point at it; everything else runs managed; and the engineers launching work do not have to think about any of it.

For a planner, none of this is in your way. You hand work to an agent the same as always, and when a profile is pointed at the cluster host, the run lands there on its own.

Sandbox hosts and deployment options

The deployment choice this sits inside: managed, Kubernetes, or Local Docker.

Network and connectivity

The outbound channel, the short-lived tokens, and how a host recovers.

Approved actions

The credential and identity model a workload identity fits into.