AI models & SDK configs

Choose the AI runtimes your agents run on. Connect Claude, Codex, Google Gemini, or GitHub Copilot with your own keys or on bundled spend, turn on the models you trust, set a workspace and a personal default, and test each one before you rely on it. The how-to for the runtimes behind every run.

Every agent runs on an AI runtime, and you decide which ones your workspace can use. Disco Parrot is not wired to a single vendor, so you are never betting that one lab stays ahead at everything: you connect one or more runtimes, on your own provider keys or on the spend bundled into your subscription with nothing to procure, turn on the specific models you trust, set the defaults, and the platform resolves the right runtime for each run. This page is the how-to for setting that up. For why multi-runtime routing matters, the resolution cascade in full, and how spend is metered, read the AI models concept.

The runtimes you can connect

A runtime is a provider and the way the platform talks to it. Four are supported, and each signs in its own way.

Runtime	What it is
Claude	The primary runtime, run through the Claude Agent SDK.
Codex	OpenAI's Codex, for coding and agentic work.
Google Gemini	Google's Gemini models, run through the Gemini SDK.
GitHub Copilot	Models served through your Copilot subscription.

You configure each on the AI models page under Platform, one card per runtime. A runtime has to be enabled before anything in the workspace can run on it, and you can have more than one enabled at once so different work runs on different providers.

Bring your own keys, or let Disco Parrot run it

You can power a runtime two ways, and you can mix them.

Bring your own keys. Connect a runtime with your own provider account (a Claude or Gemini API key, a Codex sign-in, or a Copilot authorization), and the usage bills to that account. This is the path when you already have a provider relationship, or you want your own billing and your own rate limits.
Let Disco Parrot run it. Turn on managed spend and there is no provider to configure and no key to hold: the platform supplies the runtime and the models, and the usage draws from the spend bundled into your subscription. You turn it on and your team starts working the same day, with nothing to procure.

The two are not exclusive. A workspace can run entirely on managed spend, entirely on its own keys, or both at once, leaning on the bundled models for everyday work and pointing a specific runtime at your own account when you want a particular model or your own billing relationship. The resolution cascade picks the right runtime the same way regardless of who is paying for it, and a managed key, like a key you bring, stays on the server and never reaches the agent.

Choosing managed spend or your own keys

Most workspaces do not have to choose once and for all, because the two paths coexist per runtime. But each has a shape it fits, and naming it makes the call quick.

Reach for managed spend when you want to start now. There is no provider account to open, no key to procure, no finance approval to clear before the first run: you turn a runtime on and the team is working the same day, and the usage lands on the subscription you already pay. It is the path that gets a new workspace from nothing to running agents in an afternoon, and the path that keeps the AI line on one invoice instead of a separate provider bill for each lab.

Reach for your own keys when you already hold a provider relationship you want to keep, when a model you need lives behind your own account, or when finance wants the AI usage on the provider's own invoice and you want your own rate limits. Bringing a key changes nothing about how the platform routes or how a credential is held; it changes whose account the usage bills to.

A common shape is both at once: managed spend carries the everyday runtimes so the team is never blocked on procurement, and one runtime points at your own account for the model or the billing relationship you specifically want. The cascade does not care which is which, so you can move a runtime from bundled spend to your own key later without rewiring a single skill or profile that runs on it.

Two ways to power a runtime. They differ in who pays and what you set up, and they converge on the same routing and the same server-side credential.

Connect a runtime

Connecting a runtime with your own credentials means giving the platform the credential it needs to call the provider on your agents' behalf. (On managed spend there is nothing to connect here: the platform holds the runtime, and you turn it on and pick its models like any other.) How you connect your own depends on the provider:

Claude and Google Gemini connect with an API key. You store the key as a managed secret, and the runtime references it.
GitHub Copilot connects through its OAuth sign-in, the same GitHub authorization the rest of the platform uses.
Codex connects with a device-code sign-in to your account, or with an API key if you prefer.

Each runtime signs in its own way, and the credential stays server-side either way.

In every case the credential is stored server-side and referenced by the runtime, never pasted into a place an agent could read it. The platform calls the provider on the agent's behalf, which is why a key never reaches the agent: the runtime is the boundary, and the secret stays behind it.

Screenshot (public/docs-media/sdk-config-detail.png): the AI model (SDK config) detail page for Claude, with an Authentication section ('Run on bundled spend' toggle and a 'Configure secret' link), a Status section (Enabled, Tenant default, and Personal default switches), a Models checklist with a star for the default, and a Test action.

Caption: One runtime card: the credential it references, the enabled and default switches, and the short list of models turned on. Connect a second card and a workspace runs both at once.

Connect with an API key

Claude and Google Gemini connect with a key you hold with the provider. On the runtime's card, open Configure secret, paste the key once, and it is stored as a managed secret that the runtime references by name from then on. The card never shows the key back; it shows that a secret is set. Turn the runtime on, run the Test, and a green result means the credential and a model both answered. If your team rotates the key, you update the one secret and every run that resolves to this runtime picks up the new value, because nothing downstream ever held a copy.
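The reference-by-name arrangement described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's implementation: `SecretStore`, `RuntimeConfig`, the secret name `claude-api-key`, and the key values are all hypothetical.

```python
from dataclasses import dataclass


class SecretStore:
    """Hypothetical server-side store: values are written here and read only by the platform."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}

    def put(self, name: str, value: str) -> None:
        self._values[name] = value

    def is_set(self, name: str) -> bool:
        return name in self._values

    def resolve(self, name: str) -> str:
        # Looked up only at the moment the platform calls the provider.
        return self._values[name]


@dataclass(frozen=True)
class RuntimeConfig:
    runtime: str
    secret_name: str  # a reference to the secret, never the key itself

    def card_view(self, store: SecretStore) -> dict[str, object]:
        # What the card shows: that a secret is set, not its value.
        return {"runtime": self.runtime, "secret_set": store.is_set(self.secret_name)}


store = SecretStore()
store.put("claude-api-key", "key-v1")
claude = RuntimeConfig(runtime="claude", secret_name="claude-api-key")

# Rotation: update the one secret. The config holds only the name,
# so the next resolve picks up the new value.
store.put("claude-api-key", "key-v2")
```

Because the config carries a name rather than a value, rotation touches one entry and nothing downstream has a stale copy to update.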

GitHub Copilot connects the same way the rest of the platform uses GitHub: open its OAuth sign-in, authorize, and the platform holds the resulting session. There is no key to paste and none to rotate by hand.

Codex signs in with a device code rather than a pasted key. Choose the Codex-managed option and start the sign-in: the platform shows you a short code and a link, you open the link, enter the code, and authorize the connection with your account. The platform completes the handshake, stores the session server-side, and refreshes it silently from then on, so you sign in once and the runtime keeps working without you re-entering anything. Disconnect from the card to drop the stored session, and sign in again to reconnect. An API key is there as the alternative if your team would rather manage one.
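The device-code sign-in follows a familiar shape: the platform starts a sign-in, shows a short code and a link, and polls until the person has authorized. The sketch below simulates that shape in memory. `FakeProvider`, the codes, the URL, and `complete_sign_in` are hypothetical stand-ins, and a real flow would wait between polls and refresh the session afterwards.

```python
from typing import Optional


class FakeProvider:
    """Stand-in for the provider's sign-in service (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._user_codes: dict[str, str] = {}  # device_code -> user_code
        self._approved: set[str] = set()

    def start(self) -> dict[str, str]:
        device_code, user_code = "device-123", "ABCD-EFGH"
        self._user_codes[device_code] = user_code
        return {
            "device_code": device_code,  # kept by the platform
            "user_code": user_code,  # the short code shown to the person
            "verification_uri": "https://provider.example/device",
        }

    def approve(self, user_code: str) -> None:
        # The person opens the link, enters the code, and authorizes.
        self._approved.add(user_code)

    def poll(self, device_code: str) -> Optional[str]:
        # Returns a session once the code has been approved, else None.
        if self._user_codes[device_code] in self._approved:
            return f"session-for-{device_code}"
        return None


def complete_sign_in(
    provider: FakeProvider,
    device_code: str,
    sessions: dict[str, str],
    max_polls: int = 3,
) -> str:
    """Poll until authorized; store the session server-side on success."""
    for _ in range(max_polls):
        session = provider.poll(device_code)
        if session is not None:
            sessions["codex"] = session
            return "connected"
    return "waiting for authorization"
```

The person only ever handles the short user code; the device code and the resulting session stay with the platform.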

Screenshot (public/docs-media/codex-device-code.png): the Codex authentication panel mid-sign-in, with 'Codex-managed' selected over 'API key', a verification URL and a short user code, a 'Waiting for authorization' status, and the connected state after success.

Caption: Codex device-code sign-in: enter a short code on the provider's site once, and the platform keeps the session refreshed.

Choose which models are on

A runtime exposes a set of models, and you decide which of them your workspace can use. For providers that publish a model-list API, the list you see is fetched live when a key is present, so it reflects what the provider actually offers right now; without a key yet, or for a runtime the platform keeps on a curated set, the page shows a known list so you always have something to pick from.

The models you can pick from come live from the provider when a key is present, or from a curated fallback when not. You enable the ones you trust, and one is the default.

The models you turn on are an allowlist. Turn on the two or three you trust and leave the rest off, and those are the only models any person, skill, or Flow in the workspace can reach through that runtime. One of them is the runtime's default model, marked with a star, used whenever a run on that runtime does not pin a more specific model; star a model that is off and it turns on as the default at the same time. The allowlist is enforced at run time: if a skill or a sandbox profile pins a model the runtime no longer has on, the run stops and tells you rather than quietly swapping in a different one.
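The allowlist rules above (a default marked with a star, starring an off model turns it on, and a pinned model that is off stops the run) can be sketched as follows. The class, the exception, and the model names are hypothetical, chosen only to illustrate the behavior described.

```python
from dataclasses import dataclass, field
from typing import Optional


class ModelNotEnabled(Exception):
    """Raised when a run asks for a model the runtime does not have on."""


@dataclass
class RuntimeModels:
    runtime: str
    enabled: set[str] = field(default_factory=set)
    default: Optional[str] = None

    def star(self, model: str) -> None:
        # Starring a model that is off turns it on and makes it the default.
        self.enabled.add(model)
        self.default = model

    def model_for_run(self, pinned: Optional[str] = None) -> str:
        if pinned is None:
            if self.default is None:
                raise ModelNotEnabled(f"{self.runtime} has no default model")
            return self.default
        if pinned not in self.enabled:
            # Stop and report rather than quietly swapping in another model.
            raise ModelNotEnabled(f"{pinned!r} is not enabled on {self.runtime}")
        return pinned


models = RuntimeModels("claude", enabled={"strong-model"})
models.star("fast-model")  # was off: now enabled and the default
```

A run with no pin gets the starred default, a run pinned to an enabled model gets that model, and a run pinned to anything else raises instead of falling back.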

Which models to turn on

The allowlist is short on purpose, and a good one is a handful of models, not the provider's whole catalog. A pattern that holds up across teams:

One strong reasoning model for the work that earns it: implementation, tricky design, the plan that has to be right. This is the model your heavyweight skills pin to.
One fast, cheaper model for the routine: cleanup, triage, summarizing a thread. The work that does not need the premium model should not cost like it does.
A model from a second provider, when you run more than one runtime, for the cross-provider check, so a change can be graded by a lab that did not write it.

Leave the rest off. A model nobody turned on is a model no skill, person, or Flow can reach, so the costly ones you do not want in everyday use stay out of reach by being off rather than by anyone remembering not to pick them. Turn a model on when a skill needs it and a person you trust asked for it, not because the provider shipped it. The shorter the list, the easier it is to reason about what your workspace can actually run.

Test a runtime

Before you rely on a runtime, test it. The test sends a small, real request to the provider and reports back whether the credential and the model both work, distinguishing an authentication problem from a transient provider hiccup. A misconfigured key becomes an answer in seconds on this page instead of a failed run an hour later, and you rely on a runtime you have watched answer rather than one you hope is configured right.
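One plausible way to tell an authentication problem from a transient one is by the HTTP status of the probe request. The sketch below assumes the provider answers over HTTP with conventional status codes (401 and 403 for rejected credentials, 429 and 5xx for temporary conditions). How the platform's own test classifies results is not documented here, so `Outcome` and `classify` are illustrative names only.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    AUTH_REJECTED = "authentication rejected"
    TRANSIENT = "transient provider error"
    OTHER = "request rejected"


def classify(status_code: int) -> Outcome:
    """Map the HTTP status of a small probe request to a test outcome."""
    if 200 <= status_code < 300:
        return Outcome.PASSED
    if status_code in (401, 403):
        # The credential itself was refused: fix the key, do not retry.
        return Outcome.AUTH_REJECTED
    if status_code == 429 or status_code >= 500:
        # Rate limiting or a server-side fault: worth retrying later.
        return Outcome.TRANSIENT
    return Outcome.OTHER
```

The useful property is that the two failure kinds call for different actions: a rejected credential needs a new secret, while a transient error needs only a re-test.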

Screenshot (public/docs-media/runtime-test-result.png): the Test result panel after testing the Claude runtime, with 'Credential' and 'Model' rows each marked passed, a 'Re-test' button, and a failure example where the 'Credential' row reads 'Authentication rejected', distinct from how a transient model error would read.

Caption: Testing a runtime: the credential and a model both answer, and a failure says which one broke, so a bad key is an answer in seconds rather than a failed run an hour later.

Set the defaults

A runtime is chosen for a run through a cascade of defaults, and you set two of the rungs here.

The workspace default is the runtime a run uses when nothing more specific applies. One runtime carries it for the whole workspace.
Your personal default is the runtime you prefer for your own runs. It sits above the workspace default, so you can work on the provider you like without changing anyone else's setup.

Every runtime also has an enabled switch. A disabled runtime is never chosen, which is how an admin turns a provider off for the whole workspace without deleting its configuration, and a connection you are still setting up stays out of the way until you switch it on.

Screenshot (public/docs-media/sdk-configs-list.png): the AI models (SDK configs) list page, one row per runtime with a status ('Active', 'Not configured', or 'Disabled') and 'Tenant default' and 'Personal default' badges where they apply.

Caption: The configured runtimes, each with its status, and the workspace and personal defaults marked.

How a runtime gets chosen

When a run starts, the platform resolves a runtime through a cascade, and the first option that is both set and enabled wins. The full reasoning lives on the concept page; the order is what to know here.

The runtime for a run is resolved top to bottom. The first one that is set and enabled wins.

A skill's pinned runtime, when the skill running carries one.
A resumed session's runtime, so a conversation you pick back up keeps the runtime it was already on rather than switching underneath you.
Your personal default.
The sandbox profile's runtime, when the profile the run uses specifies one.
The workspace default.
The first enabled runtime, as a final fallback, and when more than one is enabled the platform prefers Claude so the fallback is predictable. If you have turned every Claude runtime off, the platform refuses this fallback rather than reaching for a disabled one, so a deliberate choice to disable a provider is never quietly undone.

Disabled runtimes are skipped at every rung, and the resolved runtime also has to be present in the sandbox image the run uses.
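The order above can be read as a first-match search over the rungs. The following is a minimal sketch of that reading, not the platform's resolver: `RunContext`, `resolve_runtime`, and the lowercase runtime names are hypothetical. It does not model the refusal case described for a workspace where Claude has been turned off; it only guarantees that a disabled runtime is never returned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunContext:
    skill_pin: Optional[str] = None
    resumed_session: Optional[str] = None
    personal_default: Optional[str] = None
    profile_runtime: Optional[str] = None
    workspace_default: Optional[str] = None


def resolve_runtime(ctx: RunContext, enabled: list[str]) -> Optional[tuple[str, str]]:
    """Return (runtime, rung) for the first rung that is both set and enabled."""
    rungs = [
        ("skill pin", ctx.skill_pin),
        ("resumed session", ctx.resumed_session),
        ("personal default", ctx.personal_default),
        ("sandbox profile", ctx.profile_runtime),
        ("workspace default", ctx.workspace_default),
    ]
    for rung, runtime in rungs:
        # Disabled runtimes are skipped at every rung.
        if runtime is not None and runtime in enabled:
            return runtime, rung
    # Final fallback: an enabled runtime, preferring Claude when it is enabled.
    if "claude" in enabled:
        return "claude", "fallback"
    if enabled:
        return enabled[0], "fallback"
    return None
```

Returning the rung alongside the runtime makes it easy to answer "why did this run land here", which is usually the first question when a run resolves somewhere unexpected.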

When a runtime is not in the sandbox image

Resolving a runtime is one half of the picture; the other is that the runtime has to actually be present in the sandbox image the run uses. The image carries each provider's command-line bundle, and a slimmer image may not carry all of them. When the cascade lands on a runtime the image does not include, the run does not fail somewhere deep and leave you guessing: it reports the gap up front, naming the runtime it resolved and the image that is missing it.

The fix is one of two moves. Point the run at a profile whose image includes that runtime, or change the pin or default that won so the cascade lands on a runtime the image does carry. This is why testing a runtime on this page and seeing it run in a real session are two different checks: the test proves the credential and the model answer, while the image decides whether the run can reach that runtime at all. A green test plus a profile on the right image is the pair that means a runtime is ready end to end.
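The two checks can be kept separate in code as well. In this sketch the image names, the `IMAGE_RUNTIMES` table, and both functions are hypothetical; the point is only that the gap is reported up front, naming the runtime and the image, and that readiness needs both checks to pass.

```python
from typing import Optional

# Hypothetical images and the runtime bundles each one carries.
IMAGE_RUNTIMES: dict[str, set[str]] = {
    "full": {"claude", "codex", "gemini", "copilot"},
    "slim": {"claude"},
}


def image_gap(runtime: str, image: str) -> Optional[str]:
    """Return an up-front error message if the image lacks the runtime, else None."""
    if runtime in IMAGE_RUNTIMES.get(image, set()):
        return None
    return f"resolved runtime '{runtime}' is not in sandbox image '{image}'"


def ready_end_to_end(test_passed: bool, runtime: str, image: str) -> bool:
    # Two separate checks: the credential and model test, and the image contents.
    return test_passed and image_gap(runtime, image) is None
```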

Run several at once

A workspace can have more than one runtime enabled at the same time. That is what lets you route by provider as well as by model: a skill on one lab, a profile on another, a person on the one they prefer, all live in the same workspace at once.

Kind of work	Role	Runs on
Implement a plan	The heavy lift	A top-tier model
Verify the work	An independent check	A different provider, graded by a model that did not write it
Routine chores	Cheap and fast	A low-cost model, saving the premium model for where it counts
Anything unset	The fallback	Workspace default

Pin a model to each kind of work. A different provider can check the first one's output, and routine work runs on a cheaper model.

Running more than one runtime at once, and pinning a runtime or off-default model to a skill or profile, are part of the multi-provider capability on your plan. A workspace without it runs a single enabled runtime, the workspace default, which covers the everyday case; a workspace with it can route by skill, by profile, by person, and by workspace all at once. You set which models are enabled per runtime no matter which plan you are on, so the set a workspace can reach is the short list you turned on, never the provider's whole catalog by default.

A cross-provider check, set up once

Sarah's Analytics team trusts Claude for implementation work on the Insights project, and they want the review graded by a different lab, so a model is never the only judge of its own output. Setting that up is two runtimes and two pins.

First she connects both. Claude is already on, carrying the workspace default. On the AI models page she adds a second runtime, signs it in with its own key, turns on the one model the team wants for review, tests it, and enables it. Two providers are now live in the workspace at once.

Then she routes the work, not by choosing a model each run but by pinning one to each skill. The team's implement skill is pinned to Claude; the verify skill is pinned to the second provider. The pins travel with the skills, so a plan that moves through In progress into In review is written on one lab and checked by another without anyone touching a dropdown. When Tom runs the implement skill and Maya's CSV Export Flow reaches its review step, each lands on the runtime its skill carries.

Nothing else in the workspace changes. Routine cleanup still runs on the workspace default; a person who set a personal default still gets it. The cross-provider check is now a standing rule on the two skills that need it, not a habit the team has to remember.

What it costs and how it is metered

How you pay depends on the path. On your own keys, the bill is the provider's, on the account whose key the runtime references, and you read it in that provider's own console; Disco Parrot holds the key and calls the provider but is not in the middle of that invoice. On managed spend, the usage draws from your subscription and Disco Parrot meters it for you.

The managed metering is detailed. Disco Parrot records the billable AI usage behind your runs and reports it as a running monthly figure, broken down by model, by person, by project, and by cost center, and you can export the line items to CSV when finance wants them in its own tools. Usage that has not been priced yet shows up as its own line rather than disappearing, and the AI spend sits alongside the rest of your plan's usage, not in a separate bill to reconcile.
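To make the breakdown concrete, here is a small sketch of how line items might roll up by one dimension, keep unpriced usage visible, and export to CSV. The field names, the figures, and the model names are invented for illustration; only the dimensions (model, person, project, cost center) come from the description above.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

FIELDS = ["model", "person", "project", "cost_center", "cost"]

# Hypothetical line items; cost is None while usage has not been priced yet.
LINE_ITEMS = [
    {"model": "strong-model", "person": "tom", "project": "insights", "cost_center": "analytics", "cost": Decimal("4.20")},
    {"model": "fast-model", "person": "maya", "project": "insights", "cost_center": "analytics", "cost": Decimal("0.35")},
    {"model": "strong-model", "person": "maya", "project": "insights", "cost_center": "analytics", "cost": None},
    {"model": "fast-model", "person": "tom", "project": "insights", "cost_center": "analytics", "cost": Decimal("0.15")},
]


def breakdown(items: list[dict], dimension: str) -> dict[str, Decimal]:
    """Sum priced usage by one dimension, such as 'model' or 'person'."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for item in items:
        if item["cost"] is not None:
            totals[item[dimension]] += item["cost"]
    return dict(totals)


def unpriced(items: list[dict]) -> list[dict]:
    # Unpriced usage is reported as its own line rather than dropped.
    return [item for item in items if item["cost"] is None]


def to_csv(items: list[dict]) -> str:
    """Export line items; an unpriced cost is written as an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        cost = "" if item["cost"] is None else str(item["cost"])
        writer.writerow({**item, "cost": cost})
    return buffer.getvalue()
```

`Decimal` is used instead of floats so that small per-run amounts add up exactly.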

Set a budget and watch it fill

A managed-spend workspace is not flying blind between invoices. You set a monthly budget on the AI spend, and the running figure fills against it in view, so the question "how much have we used this month" is answered on the page rather than reconstructed after the fact. As the figure approaches the limit the status turns over from neutral to a warning, so the signal reaches you before the month closes, not when the bill arrives. The budget is a line you watch, and the routing you already set is the lever you pull against it: send the premium model only to the skills that earn it and the figure rises more slowly, with no one having to remember to economize run by run.
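The status logic amounts to comparing the month-to-date figure with a fraction of the limit. This sketch assumes a warning threshold of 80 percent as an example value and treats the budget as report-only, a line you watch rather than a cap; `budget_status` and its parameters are hypothetical names.

```python
from decimal import Decimal


def budget_status(
    spent: Decimal,
    limit: Decimal,
    warn_at: Decimal = Decimal("0.80"),  # example threshold, an assumption
) -> tuple[int, str]:
    """Return (percent used, status). This only reports; it never stops a run."""
    if limit <= 0:
        raise ValueError("monthly limit must be positive")
    fraction = spent / limit
    percent = int(fraction * 100)  # truncated toward zero
    status = "warning" if fraction >= warn_at else "neutral"
    return percent, status
```

For example, 120 spent against a limit of 500 reads as 24 percent and neutral, while 410 against the same limit reads as 82 percent and a warning.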

Screenshot (public/docs-media/ai-budget-editor.png): the monthly AI-spend budget editor, with a monthly-limit field, a warning-threshold field (set around 80 percent), the current month-to-date spend, a 'Save budget' button, and a helper note that the budget watches managed spend and does not cap or stop runs.

Caption: Setting the monthly budget: the limit, and the point the status turns to a warning, so the signal reaches you before the month closes.

Screenshot (public/docs-media/ai-spend-usage.png): the usage view of monthly AI spend, with the running figure against the plan's included amount, breakdowns by model, person, project, and cost center, a budget with its percent-used status, and a separate unpriced-usage line.

Caption: Managed AI spend, metered and reported against your plan, broken down the way a finance owner needs it.

Either way, routing by model is how you hold the number down: keep the premium model on the skills that earn it and send routine work to the cheaper one, and the spend follows the pins you set rather than a habit anyone has to remember. The full metering model, the dimensions it breaks down by and how unpriced usage is handled, lives on the AI models concept.

Keys stay server-side

Across every runtime, the credential is the platform's to hold, not the agent's. A model-provider key, a Copilot token, a Codex session: each is stored server-side and applied when the platform calls the provider, and every sandbox host blocks it from entering the container at launch. An agent produces work on a model without ever seeing what authenticates to it, which is the same principle that governs approved actions and MCP credentials across the platform.

Why this works the way it does

Tying a product to one AI vendor is a bet that one lab stays ahead at everything forever, and that is not how the models have moved. Disco Parrot treats the runtime as a setting so you are never holding that bet: run on the spend bundled into your subscription, bring the providers you want with your own accounts and billing, or mix the two, turn on the models you trust, and route each kind of work to the model that suits it.

For your team, that means running the model you trust for the work in front of you, set once on the skills and profiles that do each job rather than chosen by hand every run. For an engineer, the standout is the cross-provider check: let one provider write a change and a different one review it, so the work is graded by a model that did not produce it. For an admin, the choice of managed spend or your own keys, the enabled-model allowlist, and the two default switches are the levers you actually pull: start the same day on bundled spend or wire your own billing, leave a costly model off so no one can reach it, and let the per-skill pins spend the premium model only where it earns its keep. For a security owner, the credential never reaches the agent on any runtime, managed or your own, which is the same line the rest of the platform holds.

AI models concept. The runtimes, the cascade in full, and how spend is metered.
Skills. Pin a runtime or model to a kind of work.
Approved actions. Why a key never reaches the agent.