External Document Sources
Instead of uploading files manually, you can connect an external storage provider and have documents synced into your Knowledge Base automatically. This keeps your KB content up to date as files change in your organization's storage systems.
Supported Providers
| Provider | Authentication | Typical Use Case |
|---|---|---|
| SharePoint | OAuth 2.0 (Microsoft Graph) | Corporate document libraries and intranet content |
| Google Drive | OAuth 2.0 (Google API) | Shared drives and team folders |
| Amazon S3 | Access Key + Secret Key | Document archives and data lake exports |
| Dropbox | OAuth 2.0 | Team file sharing and collaboration |
| OneDrive | OAuth 2.0 (Microsoft Graph) | Personal and shared business files |
| Box | OAuth 2.0 (Box API) | Enterprise content management |
| Azure Blob Storage | Connection String or SAS Token | Cloud-native document storage |
Prerequisites
Before connecting an external source to a Knowledge Base, you must create a storage integration in your tenant settings:
- Navigate to Settings > Integrations.
- Click Add Integration and select the Storage category.
- Choose your provider (e.g., SharePoint, Google Drive).
- Enter the required credentials for your provider.
- Click Test Connection to verify access.
- Save the integration.
WARNING
Storage integration credentials are encrypted at rest using your tenant's KMS key. Even so, make sure the service account or OAuth token you provide has read-only access, scoped to only the folders you intend to sync. Never use administrative credentials with broad write permissions.
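As an illustration of the read-only, folder-scoped access described above, here is a minimal sketch for the Amazon S3 case. It builds a standard IAM policy document that grants only `s3:ListBucket` and `s3:GetObject` on a single prefix. The helper name `read_only_s3_policy`, the bucket name, and the prefix are hypothetical and not part of the platform; adapt the idea to your own provider's permission model.

```python
import json


def read_only_s3_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy allowing list/read access to one prefix of one bucket.

    Hypothetical helper for illustration; bucket and prefix are placeholders.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing the bucket, but only for keys under the prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Allow reading objects under the prefix.
                # No write or delete actions are granted anywhere.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
        ],
    }


policy = read_only_s3_policy("example-docs-bucket", "/policies/")
print(json.dumps(policy, indent=2))
```

Attach a policy like this to the IAM user whose access key you enter in the integration, rather than reusing an administrator's key.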
Adding an External Source
- Open the Knowledge Base detail view.
- Navigate to the External Sources tab.
- Click Add Source.
- Select the storage integration you configured in Settings.
- Browse to or enter the root folder path, that is, the folder in the external system that contains the documents you want to sync.
- Configure file filters and sync schedule (see below).
- Click Save Source.
Figure: External source configuration form showing the integration selector, root folder path browser, file extension filters, and sync schedule dropdown.
The first sync begins immediately after saving. Subsequent syncs follow the schedule you configure.
File Filters
File filters let you control which documents are pulled from the external source. This prevents irrelevant or oversized files from cluttering your Knowledge Base.
| Filter | Description | Example |
|---|---|---|
| File Extensions | Only sync files with these extensions | .pdf, .docx, .md |
| Include Paths | Only sync files within these subfolder paths | /policies, /product-docs |
| Exclude Paths | Skip files within these subfolder paths | /archive, /drafts |
| Max File Size | Skip files larger than this threshold | 50 MB |
TIP
Start with a narrow set of extensions (e.g., .pdf, .docx, .md) and a specific subfolder path. You can always broaden the filters later once you have confirmed that the sync produces good results.
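To make the interaction between the four filters concrete, the sketch below shows one plausible way they could combine. It is not the platform's implementation. The names `FileFilters` and `should_sync` are hypothetical, and three behaviors are assumptions: paths are relative to the root folder, an exclude path wins over an include path, and 50 MB is taken as 50 × 1024 × 1024 bytes.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List, Optional, Set


@dataclass
class FileFilters:
    """Hypothetical container mirroring the four filters in the table above."""
    extensions: Set[str] = field(default_factory=set)        # empty = any extension
    include_paths: List[str] = field(default_factory=list)   # empty = whole root folder
    exclude_paths: List[str] = field(default_factory=list)
    max_size_bytes: Optional[int] = None                      # None = no size limit


def _is_under(path: PurePosixPath, folder: str) -> bool:
    """True if path is the folder itself or sits anywhere beneath it."""
    folder_path = PurePosixPath(folder)
    return path == folder_path or folder_path in path.parents


def should_sync(path: str, size_bytes: int, filters: FileFilters) -> bool:
    """Decide whether a file passes every configured filter."""
    p = PurePosixPath(path)
    if filters.extensions and p.suffix.lower() not in filters.extensions:
        return False
    if filters.include_paths and not any(_is_under(p, inc) for inc in filters.include_paths):
        return False
    # Assumption: an exclude path takes precedence over an include path.
    if any(_is_under(p, exc) for exc in filters.exclude_paths):
        return False
    if filters.max_size_bytes is not None and size_bytes > filters.max_size_bytes:
        return False
    return True


filters = FileFilters(
    extensions={".pdf", ".docx", ".md"},
    include_paths=["/policies", "/product-docs"],
    exclude_paths=["/policies/drafts"],
    max_size_bytes=50 * 1024 * 1024,
)

print(should_sync("/policies/hr/leave.pdf", 120_000, filters))
print(should_sync("/policies/drafts/new.pdf", 120_000, filters))
```

Matching on whole path segments rather than string prefixes means `/policies-old/handbook.pdf` is not treated as part of `/policies`.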
Sync Schedules
| Schedule | Behavior |
|---|---|
| Manual | Sync only when you click the Sync Now button. |
| Hourly | Runs every 60 minutes. Best for frequently changing content. |
| Daily | Runs once per day at midnight UTC. Suitable for most use cases. |
| Weekly | Runs every Sunday at midnight UTC. Good for stable, infrequently updated document sets. |
Regardless of the configured schedule, you can trigger a sync at any time by clicking Sync Now on the source row.
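If you want to estimate when a source will next refresh, the schedule table can be expressed as a small calculation. This is only a sketch under stated assumptions: the function `next_scheduled_sync` is hypothetical, the hourly interval is counted from the last sync rather than from the top of the hour, and "Sunday at midnight UTC" is read as 00:00 at the start of Sunday.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_scheduled_sync(schedule: str, last_sync: datetime) -> Optional[datetime]:
    """Estimate the next scheduled run in UTC, or None for manual sources.

    last_sync must be timezone-aware.
    """
    last = last_sync.astimezone(timezone.utc)
    if schedule == "manual":
        return None  # runs only when Sync Now is clicked
    if schedule == "hourly":
        # Assumption: the 60-minute interval is measured from the last sync.
        return last + timedelta(minutes=60)
    midnight = last.replace(hour=0, minute=0, second=0, microsecond=0)
    if schedule == "daily":
        return midnight + timedelta(days=1)
    if schedule == "weekly":
        # weekday(): Monday is 0 and Sunday is 6; always move strictly forward.
        days_ahead = (6 - midnight.weekday()) % 7 or 7
        return midnight + timedelta(days=days_ahead)
    raise ValueError(f"unknown schedule: {schedule!r}")


last = datetime(2024, 1, 3, 15, 30, tzinfo=timezone.utc)  # a Wednesday
for name in ("manual", "hourly", "daily", "weekly"):
    print(name, next_scheduled_sync(name, last))
```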
Monitoring Sync Status
The External Sources tab displays the following for each connected source:
| Column | Description |
|---|---|
| Provider | The storage provider type and integration name |
| Root Path | The folder being synced |
| Schedule | The configured sync frequency |
| Last Sync | Timestamp of the most recent completed sync |
| Files Synced | Total number of files currently indexed from this source |
| Status | Syncing, Idle, or Error |
Click a source row to view detailed sync history, including per-file status and any errors encountered.
Figure: External Sources tab showing source rows with provider icon, root path, schedule, last sync timestamp, files synced count, and status badge.
URL Mapping
When documents are synced from an external source, the platform stores a URL mapping that links each chunk back to the original file location in the provider. When the bot retrieves a passage, it can therefore include a link to the source document in its response, so end users can open the full document directly.
URL mappings are generated automatically based on the provider's file URLs (e.g., SharePoint document links, Google Drive share URLs, S3 pre-signed URLs).
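Conceptually, a URL mapping is a lookup from indexed content back to a provider URL. The sketch below illustrates the idea only; the `Chunk` type, the `cite` function, the choice to key the mapping by document ID, and the example URL are all hypothetical and do not describe the platform's internal schema.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Chunk:
    """Hypothetical shape of an indexed passage."""
    chunk_id: str
    document_id: str
    text: str


def cite(chunk: Chunk, url_map: Dict[str, str]) -> str:
    """Return the passage text, followed by a source link when one is known."""
    url = url_map.get(chunk.document_id)
    if url is None:
        return chunk.text  # e.g. a manually uploaded file with no provider URL
    return f"{chunk.text}\n\nSource: {url}"


# Assumption: one provider URL per synced document, shared by all of its chunks.
url_map = {"doc-42": "https://example.com/docs/leave-policy.pdf"}
chunk = Chunk(chunk_id="doc-42#3", document_id="doc-42",
              text="Employees accrue leave monthly.")

print(cite(chunk, url_map))
```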
Updating and Removing Sources
- Edit -- Click the edit icon on a source row to change filters, schedule, or root path. Changes take effect on the next sync.
- Delete -- Click the delete icon to remove a source. All documents that were synced from this source will be deleted from the Knowledge Base, including their chunks and embeddings. This action cannot be undone.
WARNING
Deleting an external source removes all of its synced documents immediately. If the Knowledge Base is actively used by a bot, the bot loses access to that content and can no longer reference it. Verify that the source is no longer needed before deleting.
