The @elizaos/plugin-knowledge package provides Retrieval-Augmented Generation (RAG) capabilities for elizaOS agents. It enables agents to store, search, and automatically use knowledge from uploaded documents and text.
Key Features
- Multi-format Support: Process PDFs, Word docs, text files, and more
- Smart Deduplication: Content-based IDs prevent duplicate entries
- Automatic RAG: Knowledge is automatically injected into relevant conversations
- Character Knowledge: Load knowledge from character definitions
- REST API: Manage documents via HTTP endpoints
- Conversation Tracking: Track which knowledge was used in responses
Architecture Overview
Core Components
Knowledge Service
The main service class that handles all knowledge operations.
Document Processing
The service handles different file types with sophisticated processing logic.
Actions
The plugin provides two main actions:
PROCESS_KNOWLEDGE
Adds knowledge from files or text content:
- Supports file paths: /path/to/document.pdf
- Direct text: “Add this to your knowledge: …”
- File types: PDF, DOCX, TXT, MD, CSV, etc.
- Automatically splits content into searchable fragments
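The two input styles above could be told apart with a simple classifier. The sketch below is hypothetical: `classifyInput`, `KnowledgeInput`, and the path pattern are illustrative assumptions, not the plugin's actual parsing logic, and only the extensions named in the list are recognized.

```typescript
// Hypothetical result type: either a file to load or inline text to store.
type KnowledgeInput =
  | { kind: "file"; path: string }
  | { kind: "text"; content: string };

// Assumption: a path starts with "/", "./", "../" or "~/" and ends in one of
// the extensions listed above. The "etc." in the list is not expanded here.
const FILE_PATH_PATTERN =
  /(?:^|\s)((?:~|\.{1,2})?\/\S+\.(?:pdf|docx|txt|md|csv))\b/i;

function classifyInput(message: string): KnowledgeInput {
  const match = FILE_PATH_PATTERN.exec(message);
  if (match && match[1]) {
    return { kind: "file", path: match[1] };
  }
  // Drop an optional "... knowledge:" prefix and keep the remaining text.
  const content = message.replace(/^.*?knowledge:\s*/i, "").trim();
  return { kind: "text", content };
}
```

A message containing `/path/to/document.pdf` is classified as a file, while "Add this to your knowledge: …" falls through to the text branch.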
SEARCH_KNOWLEDGE
Explicitly searches the knowledge base:
- Triggered by: “Search your knowledge for…”
- Returns top 3 most relevant results
- Displays formatted text snippets
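Returning the top 3 results amounts to ranking fragments by similarity to the query and keeping the best few. A minimal sketch, assuming cosine similarity over in-memory embeddings; the `Fragment` shape and the `topK` helper are hypothetical, and the plugin's own search may work differently:

```typescript
// Hypothetical fragment shape: text plus its precomputed embedding vector.
interface Fragment {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every fragment against the query and keep the k best (default 3).
function topK(
  query: number[],
  fragments: Fragment[],
  k = 3
): Array<Fragment & { score: number }> {
  return fragments
    .map((f) => ({ ...f, score: cosineSimilarity(query, f.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A real deployment would normally delegate this ranking to a vector index rather than scanning every fragment.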
Knowledge Provider
Automatically injects relevant knowledge into agent responses:
- Dynamic: Runs on every message to find relevant context
- Top 5 Results: Retrieves up to 5 most relevant knowledge fragments
- RAG Tracking: Enriches conversation memories with knowledge usage metadata
- Token Limit: Caps knowledge at ~4000 tokens to prevent context overflow
For each message, the provider:
- Searches for relevant knowledge based on the user’s message
- Formats it with a “# Knowledge” header
- Tracks which knowledge was used in the response
- Enriches the conversation memory with RAG metadata
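The formatting and token cap described above can be sketched roughly as follows. This assumes about 4 characters per token as a crude estimate rather than a real tokenizer, and `buildKnowledgeBlock` is a hypothetical helper, not the plugin's API:

```typescript
const MAX_KNOWLEDGE_TOKENS = 4000; // cap mentioned in the text above
const CHARS_PER_TOKEN = 4; // assumption: rough heuristic, not a tokenizer

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Takes fragments already sorted by relevance and returns the text to inject.
function buildKnowledgeBlock(
  fragments: string[],
  maxFragments = 5,
  maxTokens = MAX_KNOWLEDGE_TOKENS
): string {
  const header = "# Knowledge";
  const selected: string[] = [];
  let used = estimateTokens(header);

  for (const fragment of fragments.slice(0, maxFragments)) {
    const cost = estimateTokens(fragment);
    if (used + cost > maxTokens) break; // stop before exceeding the budget
    selected.push(fragment);
    used += cost;
  }

  // Inject nothing at all when no fragment fits.
  return selected.length === 0 ? "" : [header, ...selected].join("\n\n");
}
```

Stopping at the first fragment that does not fit keeps the most relevant fragments and preserves their order.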
Document Processing Pipeline
1. Document Ingestion
Knowledge can be added through multiple channels.
2. Text Extraction
Supports multiple file formats.
3. Content-Based Deduplication
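As a rough illustration of content-based IDs: hashing the normalized content yields the same ID for identical content, so a repeated upload can be detected before it is stored. The sketch assumes SHA-256 and a simple line-ending and whitespace normalization; the plugin's actual ID scheme may differ, and `contentId` and `DocumentStore` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Derive a deterministic ID from the content itself (assumed scheme: SHA-256).
function contentId(content: string): string {
  const normalized = content.replace(/\r\n/g, "\n").trim();
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Hypothetical in-memory store that refuses to add the same content twice.
class DocumentStore {
  private docs = new Map<string, string>();

  add(content: string): { id: string; added: boolean } {
    const id = contentId(content);
    if (this.docs.has(id)) {
      return { id, added: false }; // duplicate: same content, same ID
    }
    this.docs.set(id, content);
    return { id, added: true };
  }

  get size(): number {
    return this.docs.size;
  }
}
```

Because the ID depends only on the content, uploading the same file under a different name still maps to the existing entry.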
Uses deterministic IDs to prevent duplicates.
4. Intelligent Chunking
Content-aware text splitting.
5. Contextual Enrichment
Optional feature for better retrieval.
6. Embedding Generation
Create vector embeddings.
7. Storage
Documents and embeddings are stored separately.
Retrieval & RAG
Semantic Search
Find relevant knowledge using vector similarity.
API Reference
REST Endpoints
Upload Documents
List Documents
Delete Document
Search Knowledge
TypeScript Interfaces
Advanced Features
Contextual Embeddings
Enable for 50% better retrieval accuracy:
- Adds document context to each chunk
- Improves semantic understanding
- Reduces false positives
- Enables better cross-reference retrieval
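One way to picture adding document context to each chunk is to prepend a short description of the source document before the chunk is embedded. The sketch below is a simplification that uses a static title and summary; the plugin may generate context differently (for example with a language model), and `enrichChunk` is a hypothetical helper.

```typescript
// Hypothetical chunk shape: position in the document plus the raw text.
interface Chunk {
  index: number;
  text: string;
}

// Build the string that would be embedded instead of the bare chunk text.
function enrichChunk(
  documentTitle: string,
  documentSummary: string,
  chunk: Chunk
): string {
  return [
    `Document: ${documentTitle}`,
    `Context: ${documentSummary}`,
    `Chunk ${chunk.index + 1}:`,
    chunk.text,
  ].join("\n");
}
```

The embedded text then carries enough context for a fragment such as "It ships in Q3" to be matched against queries that name the product.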
Document Caching
With OpenRouter, enable caching for 90% cost reduction.
Custom Document Processors
Extend for special formats.
Performance Optimization
Rate Limiting
Batch Processing
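A minimal sketch of batch processing, assuming documents are handled in fixed-size groups so that only one group runs concurrently at a time. `toBatches` and `processInBatches` are hypothetical helpers, not part of the plugin:

```typescript
// Split a list into consecutive groups of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Run the worker over each batch in turn; items within a batch run in parallel.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of toBatches(items, batchSize)) {
    const batchResults = await Promise.all(batch.map(worker));
    results.push(...batchResults);
  }
  return results;
}
```

Smaller batches lower peak memory use and the request rate against an embedding provider, at the cost of longer total processing time.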
Memory Management
Integration Patterns
Basic Integration
Configuration Options
Using the Service
Best Practices
1. Use clear, descriptive filenames
Choose names that clearly indicate the content (e.g., product-guide-v2.pdf instead of doc1.pdf).
2. Group related documents in folders
Create logical folder structures like products/, support/, policies/.
3. Tag documents with metadata
Add categories, dates, and versions to improve searchability.
4. Keep individual documents focused
One topic per document for better retrieval accuracy.
Troubleshooting
Common Issues
Documents Not Loading
Check file permissions and paths:
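A quick way to check both conditions from Node.js is sketched below. It uses only the standard `node:fs` and `node:path` modules; `checkDocumentPath` is a hypothetical helper, not part of the plugin.

```typescript
import { accessSync, constants, statSync } from "node:fs";
import { resolve } from "node:path";

interface PathCheck {
  path: string; // absolute path that was actually checked
  ok: boolean;
  reason?: string; // error code or short explanation when ok is false
}

function checkDocumentPath(inputPath: string): PathCheck {
  const path = resolve(inputPath); // resolve relative to the working directory
  try {
    accessSync(path, constants.R_OK); // throws if missing or unreadable
    if (!statSync(path).isFile()) {
      return { path, ok: false, reason: "not a regular file" };
    }
    return { path, ok: true };
  } catch (error) {
    const code =
      error instanceof Error && "code" in error
        ? String((error as { code?: unknown }).code)
        : "UNKNOWN";
    return { path, ok: false, reason: code };
  }
}
```

Printing the resolved absolute path is often enough to spot a relative path that is being resolved against an unexpected working directory.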
Poor Retrieval Quality
Try adjusting chunk size and overlap:
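To see how the two settings interact, here is a minimal sliding-window chunker. It is a character-based simplification (the plugin's chunking is described as content-aware and may count tokens instead), and `chunkText` and its option names are hypothetical:

```typescript
interface ChunkOptions {
  chunkSize: number; // maximum characters per chunk
  overlap: number; // characters shared between neighbouring chunks
}

function chunkText(text: string, { chunkSize, overlap }: ChunkOptions): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Larger chunks keep more context together but make matches less precise; more overlap reduces the chance that a relevant sentence is split across a boundary, at the cost of storing and embedding more text.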
Rate Limiting Errors
Implement exponential backoff:
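A minimal sketch of exponential backoff with jitter, assuming the caller can tell which errors are retryable. `withBackoff`, `backoffDelayMs`, and the default delays are hypothetical choices, not values taken from the plugin:

```typescript
// Delay doubles with each attempt (0-based) and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry the operation until it succeeds, a non-retryable error occurs,
// or maxAttempts calls have been made.
async function withBackoff<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt + 1 >= maxAttempts || !isRetryable(error)) {
        throw error;
      }
      // Jitter: wait between half and all of the computed delay.
      const delay = backoffDelayMs(attempt);
      await sleep(delay / 2 + Math.random() * (delay / 2));
    }
  }
}
```

The `isRetryable` callback would typically return true for rate-limit responses (for example HTTP 429) and false for errors that retrying cannot fix.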
Debug Logging
Enable verbose logging.
Summary
The Knowledge Plugin provides a complete RAG system that:
- Processes Documents: Handles PDFs, Word docs, text files, and more with automatic text extraction
- Manages Deduplication: Uses content-based IDs to prevent duplicate knowledge entries
- Chunks Intelligently: Splits documents into searchable fragments with configurable overlap
- Retrieves Semantically: Finds relevant knowledge using vector similarity search
- Enhances Conversations: Automatically injects relevant knowledge into agent responses
- Tracks Usage: Records which knowledge was used in each conversation
It also includes:
- Automatic document loading on startup
- Character knowledge integration
- RAG metadata tracking for conversation history
- REST API for document management
- Support for contextual embeddings
- Provider-agnostic embedding support