Document providers
Bring files from Google Drive and OneDrive into the workspace document library with your own read-only connection, so the agent reads them as context. This page covers how the connection works, what comes across on import, and where the files land.
The files that explain a piece of work rarely start in Disco Parrot. The customer's format spec is in someone's Google Drive, the design doc is in OneDrive, the complaint that kicked everything off is a PDF in a shared folder. A document provider is how those files come in: you connect Google Drive or OneDrive with your own read-only access, search it, and pull the file you need into the workspace document library, where the agent reads it as part of the work.
This page is the how-to for the connection itself: how you authorize a provider, what crosses the wire when you import, and how imports stay clean. For what a document is once it is in the library, how you attach it, and how the agent reads it, see documents. For where document providers sit among the other ways Disco Parrot connects to your stack, see integrations and providers.
A document provider brings files in. It is not a sync, and it is not a place the platform writes back to. Importing a file copies it into your workspace library as its own record; your Google Drive or OneDrive is untouched, and nothing the team does to the workspace copy travels back to the original. That one-directional shape is what makes a provider safe to connect with read-only access and your own account.
Connect your account
You connect a provider from the place you would use it: when you go to add a document to an initiative, plan, or bug, the attach drawer offers Google Drive and OneDrive next to uploading from your machine, and the first time you reach for one it shows a Connect button instead of a file list. There is no separate settings page to hunt for first: the connection lives where the import happens.
Connecting opens the provider's own consent popup, and what it asks for is narrow on purpose: read-only access to the files you can already see. Google Drive requests the drive.readonly scope; OneDrive requests Files.Read.All (plus the sign-in and offline-refresh scopes Microsoft pairs with it). OneDrive consent works against a work, school, or personal Microsoft account, so a person whose files live in a personal account and one whose files live in their company tenant each connect the account their files are actually in. Either way the flow uses PKCE, so the authorization never depends on a long-lived secret sitting on your machine. If you walk away mid-popup, the half-finished consent state expires within ten minutes rather than lingering.
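For readers who want to see what the PKCE step amounts to, here is a minimal sketch of it under stated assumptions. The S256 challenge derivation follows RFC 7636 and the Google scope string is the real `drive.readonly` scope. Everything else is hypothetical and only illustrates the shape of the flow, not Disco Parrot's implementation: the function names, the `pending` dictionary, the exact Microsoft sign-in scope, and the way the ten-minute expiry is modeled.

```python
import base64
import hashlib
import secrets
import time
from urllib.parse import urlencode

# Read-only scopes named on this page. The Microsoft sign-in scope ("openid")
# is an assumption; the page only says sign-in and offline-refresh scopes are added.
GOOGLE_SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
MICROSOFT_SCOPES = ["Files.Read.All", "offline_access", "openid"]

CONSENT_TTL_SECONDS = 600  # half-finished consent state expires within ten minutes


def make_code_verifier() -> str:
    # 32 random bytes encode to a 43-character base64url string,
    # the minimum verifier length RFC 7636 allows.
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    # S256: base64url(SHA-256(verifier)) with the "=" padding removed.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def start_consent(authorize_url, client_id, redirect_uri, scopes, now=None):
    """Build the consent URL and the short-lived state kept until the popup returns."""
    now = time.time() if now is None else now
    verifier = make_code_verifier()
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
        "code_challenge": code_challenge_s256(verifier),
        "code_challenge_method": "S256",
    }
    pending = {
        "state": state,
        "code_verifier": verifier,  # sent later with the code exchange, never in the URL
        "expires_at": now + CONSENT_TTL_SECONDS,
    }
    return f"{authorize_url}?{urlencode(params)}", pending


def is_pending_valid(pending, now=None) -> bool:
    now = time.time() if now is None else now
    return now < pending["expires_at"]
```

The verifier is generated fresh for each consent attempt and only its hash travels in the URL, which is why no long-lived secret has to sit on the machine.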
Once you consent, the connection is stored against your own account and refreshes itself on the next use. It is one connection per provider that travels with you: connect Google Drive once and it is there in every workspace you work in, rather than something to set up again each place. The two providers are independent, so connect Google Drive, OneDrive, or both, and each holds its own consent and reads its own files. Each person connects their own account; the platform never reaches your files through a shared service account, so what you can import is exactly what you can already open.
Search and import
With a provider connected, picking it opens a search box over your files. You type, the provider returns the matches it can see, and you pick the one you want. Picking it is the import: the platform reads the file's metadata and its content at that moment, stores the content in your workspace's own storage as the document's first version, and the file is now a record in your library, ready to attach.
The search reaches across the files that account can see and returns the closest fifty matches, freshest first, so a precise query lands the file faster than a broad one. On Google Drive that is your Drive and the files shared with you; on OneDrive it is the drive of the account you connected (work, school, or personal), rather than a SharePoint site or a teammate's drive. When the file you want is not in the first fifty, tightening the query is the move, the same way you would narrow a search in Drive or OneDrive directly.
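As an illustration of "fifty matches, freshest first", here is one way such a search could be expressed against the Google Drive API v3 `files.list` endpoint. The parameter names (`q`, `orderBy`, `pageSize`, `fields`) are real Drive API parameters. The function name, the choice to match on file name, and the use of modification time for "freshest" are assumptions; this page does not say how the platform builds its query.

```python
def drive_search_params(query: str, limit: int = 50) -> dict:
    """Build files.list query parameters for a name search (hypothetical helper)."""
    # Drive query strings quote values with single quotes; escape backslashes
    # first, then single quotes, so user input cannot break out of the literal.
    escaped = query.replace("\\", "\\\\").replace("'", "\\'")
    return {
        "q": f"name contains '{escaped}' and trashed = false",
        "orderBy": "modifiedTime desc",  # assumption: "freshest" means most recently modified
        "pageSize": limit,               # the cap of fifty described above
        "fields": "files(id,name,mimeType,size,modifiedTime,webViewLink)",
    }
```

A call such as `drive_search_params("csv format")` produces a single page of at most fifty results, which is why a tighter query, not paging, is the way to reach a file that did not make the cut.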
Importing the same file twice does not litter your library. Within a workspace, an import is deduplicated by content: pull the same Drive file again and, if its content has not changed since you last imported it, the platform hands you back the document you already have rather than making a second copy. The dedup is per workspace, so the same file imported into two separate workspaces lands once in each, each with its own record and history. The library stays a set of distinct documents, not a pile of near-duplicates.
The import is a snapshot, not a subscription. The bytes that land are the file as it was read at pick time, and they do not follow the original if it changes in Drive or OneDrive later. When the source moves on and you want the workspace copy to catch up, you import the same file again: the content has changed, so dedup steps aside and the platform stores the fresh copy as the document's next version, with the earlier one still readable. Picking is how a file comes in, and picking again is how it refreshes; the version history that results lives under documents.
If an import does not complete, say the connection drops mid-pick, the library does not keep a stub. A first import that fails partway is cleaned up, so no empty document is left sitting in the library, and a re-import that fails leaves the document you already had, and its version history, exactly as it was. Either way you retry the pick and there is nothing to tidy up by hand.
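The import rules in this section (hand back an unchanged file, add a version for a changed one, leave nothing behind on failure) can be modeled in a few lines. This is a sketch under assumptions, not the platform's code: `WorkspaceLibrary`, `Document`, the `source_key` format, and the use of SHA-256 as the content fingerprint are all hypothetical names and choices made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    source_key: str  # hypothetical identity of the source file, e.g. "gdrive:<file id>"
    versions: List[bytes] = field(default_factory=list)
    fingerprints: List[str] = field(default_factory=list)


class WorkspaceLibrary:
    """One library per workspace, so dedup never crosses workspaces."""

    def __init__(self) -> None:
        self.docs: Dict[str, Document] = {}

    def import_file(self, source_key: str, fetch: Callable[[], bytes]) -> Tuple[Document, str]:
        # Read the whole file before touching the library. If fetch() raises
        # (a dropped connection, say), nothing has been created or changed.
        content = fetch()
        fingerprint = hashlib.sha256(content).hexdigest()

        doc = self.docs.get(source_key)
        if doc is None:
            doc = Document(source_key, [content], [fingerprint])
            self.docs[source_key] = doc
            return doc, "created"
        if doc.fingerprints[-1] == fingerprint:
            return doc, "unchanged"  # same bytes as last time: hand back the existing document
        doc.versions.append(content)  # changed bytes: next version, earlier ones stay readable
        doc.fingerprints.append(fingerprint)
        return doc, "new_version"
```

Fetching before mutating is the design choice that makes the failure case trivial in this model: there is no stub to clean up because none is written until the bytes are in hand.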
When a connection needs a fresh consent
A connection keeps itself alive without you thinking about it. The access grant Google or Microsoft hands back is short-lived by design, paired with a longer-lived refresh that the platform spends on your behalf the next time you search, so a connection you set up weeks ago is ready the moment you reach for it again. You do not re-consent on a schedule, and there is no token for you to rotate.
When that quiet refresh cannot go through, the connection reads as needing your attention rather than failing in place. The usual causes are the ones you would expect: you revoked Disco Parrot from your Google or Microsoft account, your account changed, or the provider expired the grant on its own clock. The provider shows the connection as no longer ready and offers the same Connect step you used the first time. One fresh consent puts it back, and nothing you already imported is affected while the connection waits.
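The refresh-or-reconnect behavior described here follows a common OAuth pattern, sketched below. The names (`Connection`, `ensure_access_token`, `RefreshRejected`) and the 60-second safety margin are hypothetical, and the `refresh` callable stands in for the real token request to Google or Microsoft, which is not shown.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class RefreshRejected(Exception):
    """Raised by the injected refresh call when the provider refuses the grant."""


@dataclass
class Connection:
    access_token: str
    expires_at: float       # epoch seconds at which the short-lived token lapses
    refresh_token: str
    needs_consent: bool = False


def ensure_access_token(
    conn: Connection,
    refresh: Callable[[str], Tuple[str, float]],  # returns (new token, lifetime in seconds)
    now: Optional[float] = None,
    skew: float = 60.0,
) -> Optional[str]:
    """Return a usable access token, or None when a fresh consent is needed."""
    now = time.time() if now is None else now
    if conn.needs_consent:
        return None
    if now < conn.expires_at - skew:
        return conn.access_token  # still fresh, nothing to do
    try:
        token, lifetime = refresh(conn.refresh_token)
    except RefreshRejected:
        # Revoked or expired grant: flag the connection instead of failing in place.
        conn.needs_consent = True
        return None
    conn.access_token = token
    conn.expires_at = now + lifetime
    return token
```

A `None` result is the cue to show the Connect step again; documents already in the library are untouched either way because they never depended on the token.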
This is the same shape as the read-only consent itself: the connection only ever reflects access you currently have, so the day that access changes on Google's or Microsoft's side, the connection reflects it on the next use rather than holding a grant that outlived its welcome.
What comes across
What lands in your library depends on the kind of file, and the one case worth knowing before you import is Google's own document types.
- Google Docs, Sheets, and Slides are exported to PDF on the way in. Google's editor formats do not exist as portable files, so the platform asks Google for the PDF rendering and stores that. The page, the table, the slide deck all arrive as a clean PDF the agent can read; what does not come along is the live editor behavior behind them.
- OneDrive files import as themselves. A Word document, an Excel sheet, or a PDF in OneDrive lands in the library in its own format, unchanged.
- Everything else, on either provider, imports as-is. A PDF, an image, a CSV, a zip; the bytes you picked are the bytes that land.
Text files arrive ready for the agent to read directly. Binary files (a PDF, an image, an archive) are stored whole and made readable to the agent through the library's materialization, so it can decode the content when the work calls for it. You can also bring in a file that lives at a URL as a reference: the platform keeps the link rather than a copy, and the document renders as a pointer to its source.
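The per-type rules in the list above reduce to one decision: export to PDF or download as-is. The sketch below shows that decision using Google's real MIME types for Docs, Sheets, and Slides. The function name, the provider labels, and the `("export" | "download", mime)` return shape are hypothetical, chosen only to make the rule concrete.

```python
from typing import Tuple

# MIME types Google Drive reports for its own editor formats.
GOOGLE_EDITOR_TYPES = {
    "application/vnd.google-apps.document",      # Google Docs
    "application/vnd.google-apps.spreadsheet",   # Google Sheets
    "application/vnd.google-apps.presentation",  # Google Slides
}


def plan_import(provider: str, mime_type: str) -> Tuple[str, str]:
    """Return (action, stored MIME type) for a picked file."""
    if provider == "google_drive" and mime_type in GOOGLE_EDITOR_TYPES:
        # Editor formats have no portable file behind them, so ask for a PDF rendering.
        return ("export", "application/pdf")
    # Everything else, on either provider, comes across byte for byte.
    return ("download", mime_type)
```

For example, `plan_import("google_drive", "application/vnd.google-apps.document")` yields an export to PDF, while a CSV on either provider is downloaded unchanged.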
What the record remembers
Alongside the bytes, the import keeps a thin record of where the file came from: its name, its type, its size, a link back to the original in Drive or OneDrive, and a content fingerprint the platform computes from the bytes themselves. That fingerprint is what lets a second import of an unchanged file hand you back the document you already have rather than a copy, and a changed file add a version rather than a duplicate. The record also stamps who imported it and when, which is the line the audit trail reads from. What it does not keep is anything about the source beyond the file you picked: not your folder tree, not your other files, not who else can see the original.
A cloud import is also not bound by the size limit that applies to a file you upload from your own machine. It streams straight from the provider into your workspace storage, so a large fixture set or a multi-megabyte export comes in the same way a small spec does, just a moment slower.
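Streaming and fingerprinting fit together naturally: the hash can be updated chunk by chunk as the bytes pass through, so a large file never has to sit in memory whole. The sketch below shows that technique with file-like objects. The function name, the 1 MiB chunk size, and the choice of SHA-256 are assumptions; the page does not name the fingerprint algorithm the platform uses.

```python
import hashlib
from typing import BinaryIO, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an illustrative value


def stream_to_storage(source: BinaryIO, dest: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Tuple[str, int]:
    """Copy source to dest in chunks, returning (hex fingerprint, byte count)."""
    hasher = hashlib.sha256()
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        hasher.update(chunk)  # fingerprint is built from the bytes themselves
        dest.write(chunk)
        total += len(chunk)
    return hasher.hexdigest(), total
```

The returned size and fingerprint are two of the fields the record keeps, and both fall out of the copy itself rather than needing a second pass over the file.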
A newly imported document takes its visibility from a workspace setting, and the shipped default is Tenant, so an import is shared with the workspace unless your admins set the default to private or you change it on the document. The choice lives on the document record, not on where it is attached, which is why flipping one document to shared updates it everywhere at once. The full visibility model is covered under documents.
Bringing a spec in, start to finish
The customer's CSV format spec lives in Sarah's Google Drive, where the customer-success team dropped it. She is shaping the CSV Export initiative and wants that spec in the room for every conversation about the work, not pasted into one chat someone opens today and forgotten by the one someone opens next week.
She opens the initiative's Documents tab, clicks Add Document, and picks Google Drive. The first time, it shows a Connect button rather than a file list. She clicks it, Google's consent popup opens asking for read-only access to her Drive, she approves, and the panel turns into a search box over her own files. She types "csv format", picks CSV format spec from the results, and that pick is the import: the platform reads the file at that moment, renders the Google Doc to PDF, and stores it as the document's first version in the workspace library, attached to the initiative.
The next morning Tom, the engineer on the implementation plan, opens Continue in Chat. He never touched Drive and never connected an account of his own. The spec Sarah imported is already a workspace record, materialized into his sandbox at session start, so the agent answers his first question from the customer's actual format contract rather than from a guess. Sarah connected her own Drive once and pulled one file; the whole team works from it from then on.
Where the file goes, and how the agent reads it
An imported file is a document like any other the moment it lands. It joins the workspace library, you attach it to the initiative, plan, or bug it belongs with, and the attachment propagates up the work hierarchy so the document surfaces everywhere it is useful. At the start of every chat and flow, the platform materializes the attached documents into the sandbox, so the spec you pulled from Drive is in the agent's working context without anyone pasting it in.
This is what the connection buys you. The customer's format spec stops being a link someone has to remember and becomes context the agent reads on every run. The full story of the library, attachments, propagation, and how the agent uses them lives under documents; this page is the front door those files come through.
Your connection, and stepping away from it
The connection is yours, and it stays useful exactly as long as your access does. Because the platform never holds a shared service account, the day your access to a file changes on Google's or Microsoft's side, the connection reflects it; you only ever reach what you could already reach.
When you want to cut a provider loose, you revoke Disco Parrot's access from your own Google or Microsoft account settings, the same place you manage every app you have authorized. The connection stops working on the next use, and that is the clean break.
The documents you already imported do not go anywhere. They were copied into the workspace library at import time and never depended on a live link back, so disconnecting a source does not pull the files you already brought in.
Permissions
Connecting a provider and browsing your files goes with the doc-providers.read scope; importing a file you picked goes with doc-providers.import. The split keeps the act of looking separate from the act of bringing a file into the shared library. Reading and managing the documents themselves once they are in the library are governed by documents.read and documents.manage, covered under documents.
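The split between looking and importing can be pictured as a small lookup from action to required scope. The scope names come from this page; the action labels and the `can_perform` helper are hypothetical and only illustrate the check, not how the platform enforces it.

```python
from typing import Set

# Scope each provider action requires, per the split described above.
REQUIRED_SCOPE = {
    "connect": "doc-providers.read",
    "search": "doc-providers.read",
    "import": "doc-providers.import",
}


def can_perform(action: str, granted: Set[str]) -> bool:
    """True when the caller holds the scope the action needs; unknown actions are denied."""
    required = REQUIRED_SCOPE.get(action)
    return required is not None and required in granted
```

Someone holding only `doc-providers.read` can connect and search but cannot bring a file into the shared library, which is the separation the two scopes exist to provide.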
Because the credential is per-user, there is no workspace-level provider for an admin to wire. Each person connects their own Drive or OneDrive when they need it, and the audit trail records the import against the person who made it, so a document that came in from a connected provider carries the answer to who brought it and when.
Why document providers work this way
The tempting way to connect a file store is to wire it once, with a service account that can see everything, and let the platform pull whatever it wants. That is also the version that makes a security review nervous, because the connection outlives the people and reaches further than any one person should. Disco Parrot connects the other way: per-person, read-only, one-directional, and resolved against the access you already have. The connection cannot see a file you cannot, it cannot write back to the source, and it cannot keep working after you revoke it. What you get for that restraint is a library the team can trust, where every document came in through a named person's own access and stays put once it is in.
For a planner, a document provider is how the context that lives in Drive or OneDrive stops being a link you paste into a chat and becomes a document the agent reads on every run. You bring the customer's spec in once, and every conversation about that work starts with it in the room.
For an engineer, importing is search-and-pick, not a setup chore. You connect your own account read-only the first time you reach for a file, find what you need, and it lands in the library attached to the work, with Google's editor docs arriving as PDFs the agent can read. There is no workspace provider to wait on an admin for.
For a lead, the per-person model means the team's document sources do not become a shared credential nobody owns. Each person brings their own access, the workspace keeps its own copies, and the trail shows who imported what.
For the person who has to sign off on connecting a file store, the boundary is the strongest part: read-only consent, PKCE so no long-lived secret rides along, per-user access that can never exceed the person behind it, a strictly one-way data path with no write-back, and a clean revoke from the provider's own account settings. Connecting a document provider adds a way to bring a file in, and nothing else.
What a document is once it lands: the library, attachments, propagation, and how the agent reads it.
The concept: why a document provider is a different category from an integration or a repository provider.
The honest matrix of everything Disco Parrot connects to right now.