External Object Storage

Overview

External object storage integration lets you connect your own storage repositories directly to the platform. Files can then be imported into a knowledge base straight from the connected storage, which makes the RAG process smoother and more efficient.

Available providers

Right now, we support four external object storage services:

  1. AWS S3

  2. Google Cloud Storage

  3. Azure Blob Storage

  4. Dropbox

Connect external object storage

  1. Go to Workspace settings.

  2. Navigate to the External object storage section, then click Add external object storage.

Add external object storage
  3. Choose a service and fill in the credentials.

List of object storage

For more detailed guides on connecting to each external object storage, you can refer to this documentation:

Connecting external object storage to knowledge base

  1. Make sure you have already connected the external object storage in the workspace settings.

  2. Open your knowledge base, click the more menu next to the Upload files button, and select Connect object storage.

  3. Select the object storage.

  4. After you select the external object storage, a dialog appears. By default, all supported files will be added to the knowledge base.

  5. To filter the files, write Rules, then click Test rules. Once the document list matches what you want to import, click Connect object storage.

  6. Once the process is finished, you can view your files within your knowledge base.


Rules in external object storage

Rules help import specific files from external object storage into the knowledge base. Using glob patterns, you can include or exclude specific files.

How to write rules

Writing rules for file import

Rules use glob patterns to decide which files are imported from your external object storage into the knowledge base. The steps and tips below walk through writing them:

Step 1: Identify file patterns

First, determine the common patterns in the names of files you wish to import or exclude. For example, if you want to import all .pdf files, your pattern would be *.pdf.

Step 2: Utilize wildcards

  • * (asterisk) matches zero or more characters. For instance, *.pdf matches all files ending in .pdf.

  • ? (question mark) matches exactly one character. For example, ?.pdf matches a.pdf but not ab.pdf.
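The two wildcards can be tried out locally before writing rules. The snippet below is a minimal sketch that uses Python's standard fnmatch module as a stand-in; the platform's own matcher is not documented here and may differ in edge cases, and the helper name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(name: str, pattern: str) -> bool:
    """Case-sensitive glob match of a file name against a pattern.

    Assumption: Python's fnmatch stands in for the platform's matcher.
    """
    return fnmatchcase(name, pattern)


# "*" matches zero or more characters.
print(matches("report.pdf", "*.pdf"))
# "?" matches exactly one character.
print(matches("a.pdf", "?.pdf"))
print(matches("ab.pdf", "?.pdf"))
```

With these semantics, the first two checks match and the third does not, which mirrors the a.pdf and ab.pdf example above.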

Step 3: Specifying directories

If you want to specify files in a particular directory, include the directory name in your pattern. For example, myfolder/*.pdf matches all .pdf files in the myfolder directory.
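To check a directory pattern locally, the sketch below matches a path one segment at a time. It assumes that a wildcard does not cross a "/" separator, which is common glob behavior but is not confirmed for this platform; the function name `match_path` is hypothetical.

```python
from fnmatch import fnmatchcase


def match_path(path: str, pattern: str) -> bool:
    """Match a slash-separated path against a pattern, segment by segment.

    Assumption: "*" and "?" never match the "/" separator, so the path
    and the pattern must have the same number of segments.
    """
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    if len(path_parts) != len(pattern_parts):
        return False
    return all(
        fnmatchcase(part, pat)
        for part, pat in zip(path_parts, pattern_parts)
    )


print(match_path("myfolder/report.pdf", "myfolder/*.pdf"))
print(match_path("myfolder/archive/report.pdf", "myfolder/*.pdf"))
print(match_path("other/report.pdf", "myfolder/*.pdf"))
```

Under this assumption only the first path matches: the second sits in a subfolder and the third is in a different directory. Use Test rules in the dialog to confirm how the platform treats nested folders.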

Step 4: Excluding files

To exclude files, you can use the negation pattern !. For example, if you want to import all .pdf files except those starting with temp, your rules would include *.pdf and !temp*.pdf.

Step 5: Combining patterns

You can combine multiple patterns to fine-tune your selection. For example, to import .pdf and .docx files but exclude those in the drafts folder, use *.pdf, *.docx, and !drafts/*.

Example rules

  1. Import all .pdf files: *.pdf

  2. Import all files except those in the temp directory: *, !temp/*

  3. Import all .docx and .txt files in the data directory: data/*.docx, data/*.txt

Remember, the order of rules matters. Patterns defined later can override those defined earlier, so plan your rules accordingly to ensure the correct files are imported.
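The effect of rule order can be illustrated with a small evaluator in which the last matching rule decides the outcome. This is a sketch under assumptions, not the platform's implementation: it uses Python's fnmatch for matching, treats a file as excluded until some rule includes it, and the name `is_imported` is hypothetical.

```python
from fnmatch import fnmatchcase


def is_imported(path: str, rules: list[str]) -> bool:
    """Return True if the last rule that matches `path` is an include rule.

    Assumptions: rules are evaluated top to bottom, a leading "!" marks
    an exclusion, and a file matched by no rule is not imported.
    """
    imported = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(path, pattern):
            # A later match overrides whatever an earlier rule decided.
            imported = not negated
    return imported


rules = ["*.pdf", "!temp*.pdf"]
print(is_imported("report.pdf", rules))
print(is_imported("temp1.pdf", rules))

# Reversing the order lets the broad include override the exclusion.
print(is_imported("temp1.pdf", ["!temp*.pdf", "*.pdf"]))
```

With the rules in the first order, report.pdf is imported and temp1.pdf is not. In the reversed order, temp1.pdf is imported again because *.pdf comes last, which is why exclusions generally belong after the includes they refine.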
