Supported Types
In the context of DocuChat, a document is any source of information that can be used by your AI chatbots to answer questions. Documents come in various types, including:
Files
- Text Files: PDF, Word documents (DOCX), Excel spreadsheets (XLSX), PowerPoint presentations (PPTX), CSV files, E-books (EPUB), plain text (TXT), Markdown files (MD)
- Audio Files: MP3, WAV, M4A, MPGA
- Video Files: MP4, MPEG, WEBM
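If you want to check a batch of files before uploading, the lists above can be turned into a small local helper. The following is a minimal sketch, not part of DocuChat itself: the function name `classify_file` and the category labels are assumptions made for illustration, and the extensions are simply the ones listed above.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-types list above.
# The category names ("text", "audio", "video") are illustrative labels.
SUPPORTED_EXTENSIONS = {
    "text": {".pdf", ".docx", ".xlsx", ".pptx", ".csv", ".epub", ".txt", ".md"},
    "audio": {".mp3", ".wav", ".m4a", ".mpga"},
    "video": {".mp4", ".mpeg", ".webm"},
}


def classify_file(filename: str) -> Optional[str]:
    """Return the category of a supported file, or None if unsupported.

    Only the file extension is inspected (case-insensitively); the file
    contents are not checked.
    """
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None


if __name__ == "__main__":
    for name in ["Report.PDF", "interview.m4a", "demo.webm", "archive.zip"]:
        print(name, "->", classify_file(name))
```

Because the check looks only at the extension, a file that passes here may still fail to process if its contents do not match its extension.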
Websites
DocuChat can extract content from websites when you add web page URLs. Paste the URLs (one per line), and DocuChat will automatically extract the relevant content from those pages to train your chatbot. This works for most modern websites.
You can enable the "Include all webpages linked from these websites" toggle to automatically discover and extract content from pages that are linked from your provided URLs. This works for both regular web pages and sitemap URLs (e.g., example.com/sitemap.xml), allowing you to easily add content from larger websites.
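When you are collecting many URLs, it can help to clean the list before pasting it. The sketch below is a local convenience script, not a DocuChat API: the function names `prepare_url_list` and `is_sitemap_url` are hypothetical, defaulting to `https://` for URLs without a scheme is an assumption, and the sitemap check is a simple filename heuristic.

```python
from urllib.parse import urlparse


def prepare_url_list(raw_urls):
    """Return cleaned, de-duplicated URLs joined one per line.

    Blank entries and non-HTTP(S) URLs are dropped. URLs without a scheme
    get "https://" prepended (an assumption of this sketch).
    """
    seen = set()
    cleaned = []
    for raw in raw_urls:
        url = raw.strip()
        if not url:
            continue
        if "://" not in url:
            url = "https://" + url
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return "\n".join(cleaned)


def is_sitemap_url(url):
    """Heuristic: treat URLs whose path ends in 'sitemap.xml' as sitemaps."""
    return urlparse(url).path.lower().endswith("sitemap.xml")


if __name__ == "__main__":
    urls = [" example.com/docs ", "", "https://example.com/docs",
            "example.com/sitemap.xml"]
    print(prepare_url_list(urls))
```

The output is a newline-separated string that can be pasted directly into the URL field. Sitemaps with other file names will not be recognized by `is_sitemap_url`.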
YouTube Videos
DocuChat can use YouTube videos that have English captions as a source of information; the captions serve as the video's content.
Paste a YouTube video URL, and DocuChat will process it if English captions are available.
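Before pasting a link, you may want to confirm that it points to a single video rather than a channel or playlist page. The following sketch only inspects the URL shape; it does not check whether English captions exist. The function name `youtube_video_id` is hypothetical, and the set of URL forms it recognizes is an assumption rather than a statement of what DocuChat accepts.

```python
from urllib.parse import urlparse, parse_qs


def youtube_video_id(url):
    """Return the video ID from a common YouTube video URL, or None.

    Recognizes 'youtube.com/watch?v=<id>' and 'youtu.be/<id>' links.
    Other URL shapes (playlists, channels, embeds) return None.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None

    return None


if __name__ == "__main__":
    # "VIDEO_ID_123" is a placeholder, not a real video.
    print(youtube_video_id("https://www.youtube.com/watch?v=VIDEO_ID_123"))
    print(youtube_video_id("https://youtu.be/VIDEO_ID_123"))
```

A `None` result means the link does not look like a single-video URL and is worth checking by hand.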
Apps
Google Drive
When you select folders from your Google Drive, each folder becomes a single document in your library. DocuChat syncs all supported file types within those folders, including any subfolders.
When you add a Google Drive folder document to a chatbot, the chatbot gains access to all files inside that folder. Every hour, DocuChat automatically syncs any changes you make to those folders in Google Drive, so your chatbots work from recently updated content.