Supported Types
In DocuChat, a document is any source of information that your AI chatbots can use to answer questions. Documents come in several types, including:
Files
- Text Files: PDF, Word documents (DOCX), Excel spreadsheets (XLSX), PowerPoint presentations (PPTX), CSV files, E-books (EPUB), plain text (TXT), Markdown files (MD)
- Audio Files: MP3, WAV, M4A, MPGA
- Video Files: MP4, MPEG, WEBM
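If you are preparing a large batch of files, it can help to check them against the lists above before uploading. The following is a minimal sketch, not part of DocuChat itself: the extensions come from the lists above, while the function name `classify_file` and the category labels are hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported file lists above.
# The category labels ("text", "audio", "video") are illustrative only.
SUPPORTED_EXTENSIONS = {
    "text": {".pdf", ".docx", ".xlsx", ".pptx", ".csv", ".epub", ".txt", ".md"},
    "audio": {".mp3", ".wav", ".m4a", ".mpga"},
    "video": {".mp4", ".mpeg", ".webm"},
}


def classify_file(filename: str) -> Optional[str]:
    """Return the category of a file based on its extension, or None if unsupported."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match, e.g. ".PDF"
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None
```

For example, `classify_file("report.PDF")` returns `"text"`, and `classify_file("archive.zip")` returns `None`. The check looks only at the file extension, so it does not guarantee that a file's contents are valid.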
Websites
DocuChat can extract content from websites when you add web page URLs. Paste the URLs (one per line) and DocuChat automatically extracts the relevant content from those pages to train your chatbot. This works for most modern websites.
Enable the "Include all webpages linked from these websites" toggle to automatically discover all pages linked from the URLs you provide. DocuChat's web crawler finds the linked pages and gives you full control over which ones to include: you can select or deselect individual pages, search through the discovered pages, and optionally use AI-powered filtering to remove irrelevant pages such as login screens, legal notices, and navigation pages.
For a detailed guide on website crawling and page discovery, see Website Crawling.
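Because URLs are entered one per line, you may want to clean up a long list before pasting it. The sketch below is one possible way to do that outside DocuChat. It assumes you only want `http` and `https` links and want exact duplicates dropped; the function name `prepare_url_list` is hypothetical.

```python
from urllib.parse import urlparse


def prepare_url_list(urls):
    """Return a one-URL-per-line string with blanks, non-web links, and duplicates removed."""
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        # Assumption: keep only absolute http(s) URLs that have a host name.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if url in seen:
            continue  # skip exact duplicates, keeping the first occurrence
        seen.add(url)
        cleaned.append(url)
    return "\n".join(cleaned)
```

The returned string can be pasted directly into the URL field. Duplicates are matched by exact text only, so two URLs that differ by a trailing slash are both kept.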
YouTube Videos
DocuChat can use YouTube videos that have English captions as a source of information. Paste a YouTube video URL and, if English captions are available, DocuChat processes the video and uses the captions as its content.
Apps
Google Drive
When you select folders from your Google Drive, each folder becomes a single document in your library. DocuChat syncs all files of supported types within those folders and their subfolders.
When you add a Google Drive folder document to a chatbot, the chatbot gains access to all files inside that folder. Every hour, DocuChat automatically syncs any changes you make to those folders in Google Drive, so your chatbots always have access to the most up-to-date content.
How to Connect Google Drive
- Connect your Google Drive: Click the Google Drive option and authorize DocuChat to access your Google Drive
- Click "Add From Library": After connecting, click "Add From Library" to select your Google Drive folder
- Select Documents or Folders: Choose the documents or folders you want to sync (you can select multiple folders by holding Ctrl/Cmd)
- Your Drive will now show as a Source Document: The selected folders will appear as documents in your library and can be used by your chatbots