Configuring Retrievers
Retrievers are configurations for RAG (Retrieval-Augmented Generation) functionality in Wegent. They define how documents are indexed, stored, and retrieved.
Prerequisites
- Wegent platform installed and running
- Elasticsearch service enabled (optional; only needed for RAG features). To start it with the rag profile:
docker compose --profile rag up -d
What is a Retriever?
A Retriever is a CRD (Custom Resource Definition) that configures:
- Storage Backend: Vector database connection (Elasticsearch, Qdrant)
- Index Strategy: How documents are organized in the database
- Retrieval Methods: Search modes (vector, keyword, hybrid)
- Embedding Configuration: How text is converted to vectors
Creating a Retriever
Via Web UI
- Navigate to Settings → Retrievers
- Click Add Retriever
- Fill in the configuration:
  - Name: Unique identifier (e.g., my-es-retriever)
  - Display Name: Human-readable name
  - Storage Type: Select elasticsearch or qdrant
  - URL: Storage backend URL (e.g., http://elasticsearch:9200)
  - Authentication: Username/password or API key (optional)
  - Index Strategy: Choose indexing mode
  - Retrieval Methods: Enable vector, keyword, or hybrid search
- Click Test Connection to verify settings
- Click Create to save
Via API
POST /api/retrievers
Content-Type: application/json
{
  "apiVersion": "agent.wecode.io/v1",
  "kind": "Retriever",
  "metadata": {
    "name": "my-es-retriever",
    "namespace": "default",
    "displayName": "My Elasticsearch Retriever"
  },
  "spec": {
    "storageConfig": {
      "type": "elasticsearch",
      "url": "http://elasticsearch:9200",
      "username": "elastic",
      "password": "password",
      "indexStrategy": {
        "mode": "per_user",
        "prefix": "wegent"
      }
    },
    "retrievalMethods": {
      "vector": {
        "enabled": true,
        "defaultWeight": 0.7
      },
      "keyword": {
        "enabled": true,
        "defaultWeight": 0.3
      },
      "hybrid": {
        "enabled": true
      }
    },
    "description": "Elasticsearch retriever for RAG"
  }
}
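The request body above can also be assembled in code. The following is a minimal Python sketch, assuming the payload shape shown in the example; `build_retriever_manifest` is a hypothetical helper and not part of Wegent, and credentials are left out on purpose.

```python
import json


def build_retriever_manifest(name, url, storage_type="elasticsearch",
                             namespace="default", display_name=None,
                             mode="per_user", prefix="wegent",
                             vector_weight=0.7, keyword_weight=0.3):
    """Return a Retriever manifest dict mirroring the JSON body shown above."""
    if storage_type not in ("elasticsearch", "qdrant"):
        raise ValueError(f"unsupported storage type: {storage_type}")
    return {
        "apiVersion": "agent.wecode.io/v1",
        "kind": "Retriever",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "displayName": display_name or name,
        },
        "spec": {
            "storageConfig": {
                "type": storage_type,
                "url": url,
                "indexStrategy": {"mode": mode, "prefix": prefix},
            },
            "retrievalMethods": {
                "vector": {"enabled": True, "defaultWeight": vector_weight},
                "keyword": {"enabled": True, "defaultWeight": keyword_weight},
                "hybrid": {"enabled": True},
            },
        },
    }


manifest = build_retriever_manifest("my-es-retriever", "http://elasticsearch:9200")
body = json.dumps(manifest, indent=2)  # usable as the body of POST /api/retrievers
```

Building the manifest from a function keeps the defaults (per_user mode, 0.7/0.3 weights) in one place when several retrievers are created.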
Index Strategies
Choose an index strategy based on your use case:
| Strategy | Description | Best For |
|---|---|---|
| per_user | One index per user | Elasticsearch deployments, user-level isolation |
| per_dataset | One index per knowledge base | Multi-tenant scenarios, dataset isolation |
| fixed | Single fixed index | Small datasets, simple setup |
| rolling | Hash-based sharding | Large datasets, load distribution |
Recommended: per_user Mode
For Elasticsearch, we recommend using per_user mode:
{
  "indexStrategy": {
    "mode": "per_user",
    "prefix": "wegent"
  }
}
This creates indices like wegent_user_123, providing better performance and isolation.
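To make the four strategies concrete, here is an illustrative Python sketch of how an index name might be derived for each mode. Only the per_user form (wegent_user_123) comes from this page; the naming for per_dataset, fixed, and rolling, the `shards` parameter, and the function itself are assumptions made for illustration.

```python
import hashlib


def index_name(mode, prefix="wegent", user_id=None, knowledge_id=None, shards=8):
    """Illustrative index naming per strategy; only the per_user form is documented."""
    if mode == "per_user":
        return f"{prefix}_user_{user_id}"      # documented form: wegent_user_123
    if mode == "per_dataset":
        return f"{prefix}_kb_{knowledge_id}"   # hypothetical naming
    if mode == "fixed":
        return prefix                          # hypothetical: one shared index
    if mode == "rolling":
        # Hypothetical: a stable hash of the knowledge base ID selects one of `shards` indices.
        digest = hashlib.sha256(str(knowledge_id).encode("utf-8")).hexdigest()
        return f"{prefix}_{int(digest, 16) % shards}"
    raise ValueError(f"unknown index strategy: {mode}")
```

A stable hash (rather than Python's built-in `hash`) is used for the rolling case so the same knowledge base always maps to the same index across processes.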
Retrieval Methods
Vector Search (Semantic)
Pure vector similarity search for semantic understanding:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | vector |
| top_k | Number of results | 5 |
| score_threshold | Relevance threshold | 0.7 |
Use cases: Concept matching, understanding questions, semantic search
Keyword Search
Traditional BM25 keyword matching:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | keyword |
| top_k | Number of results | 5 |
Use cases: Exact term matching, code search, API names
Hybrid Search (Vector + Keyword)
Combines vector similarity with BM25 keyword matching:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | hybrid |
| vector_weight | Vector weight | 0.7 |
| keyword_weight | Keyword weight | 0.3 |
| top_k | Number of results | 5 |
| score_threshold | Relevance threshold | 0.7 |
Weight recommendations (vector weight / keyword weight):
- Conceptual queries (0.8/0.2): Understanding, explanations
- Balanced (0.7/0.3): General purpose (default)
- Precise matching (0.3/0.7): Code search, API names, exact terms
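The effect of the weights can be sketched with a simple linear blend. This page does not state how Wegent fuses the two scores, so the weighted sum below is an assumption, as are the function name and the premise that both score sets are normalized to the range 0 to 1.

```python
def hybrid_scores(vector_scores, keyword_scores, vector_weight=0.7,
                  keyword_weight=0.3, top_k=5, score_threshold=0.0):
    """Blend per-chunk scores linearly, then filter by threshold and rank.

    Both inputs map chunk IDs to scores assumed to be normalized to [0, 1].
    """
    chunk_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        cid: vector_weight * vector_scores.get(cid, 0.0)
        + keyword_weight * keyword_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return [(cid, score) for cid, score in ranked if score >= score_threshold][:top_k]


# Chunk "a" matches semantically only, "b" is a strong keyword hit, "c" a weak one.
vector = {"a": 0.9, "b": 0.4}
keyword = {"b": 1.0, "c": 0.5}
conceptual = hybrid_scores(vector, keyword, 0.8, 0.2)
precise = hybrid_scores(vector, keyword, 0.3, 0.7)
```

With the conceptual weights, chunk "a" ranks first; with the precise-matching weights, the keyword hit "b" moves to the top.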
Retrieval Test
Before saving a retrieval configuration, use the retrieval test feature to check how well it performs on real queries.
How to Use
- Go to Knowledge Base Retrieval Settings
- Configure retrieval parameters (mode, top_k, threshold, etc.)
- Enter a test query in the Retrieval Test area
- Click the Test button
- Review returned document chunks and relevance scores
- Adjust parameters based on results
- Click Save when satisfied
Test Recommendations
| Test Type | Suggested Query |
|---|---|
| Conceptual | Use descriptive questions like "What is..." |
| Precise | Use specific terms or names |
| Boundary | Use vague or unrelated queries |
The retrieval test helps you tune the retrieval configuration before it is used in real conversations.
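When adjusting the threshold, it can help to see how many chunks would survive at each candidate value. The sketch below is a hypothetical offline helper, not a Wegent feature; the scores are made-up numbers standing in for the relevance scores shown by a test run.

```python
def sweep_thresholds(scores, thresholds=(0.5, 0.6, 0.7, 0.8)):
    """Return how many chunks would pass each candidate score_threshold."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


# Hypothetical relevance scores copied from one retrieval test run.
observed = [0.91, 0.78, 0.72, 0.64, 0.41]
kept = sweep_thresholds(observed)
```

Here a threshold of 0.7 keeps three of the five chunks, while 0.8 keeps only one, which suggests trying a lower threshold if too few results come back.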
Using Retrievers with RAG
1. Upload and Index Documents
POST /api/rag/documents/upload
Content-Type: multipart/form-data
- knowledge_id: "kb_001"
- retriever_name: "my-es-retriever"
- retriever_namespace: "default"
- file: <document.pdf>
- embedding_config: {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "sk-..."
  }
Supported file types: MD, PDF, TXT, DOCX, and code files
2. Retrieve Relevant Chunks
POST /api/rag/retrieve
Content-Type: application/json
{
  "query": "How do I configure a bot?",
  "knowledge_id": "kb_001",
  "retriever_ref": {
    "name": "my-es-retriever",
    "namespace": "default"
  },
  "embedding_config": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "sk-..."
  },
  "top_k": 5,
  "score_threshold": 0.7,
  "retrieval_mode": "hybrid",
  "hybrid_weights": {
    "vector_weight": 0.7,
    "keyword_weight": 0.3
  }
}
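The same request can be prepared from Python with the standard library. This is a sketch under assumptions: the backend address http://localhost:8000 is taken from the API docs URL on this page, `build_retrieve_request` is a hypothetical helper, and any authentication headers your deployment requires are not shown.

```python
import json
import urllib.request


def build_retrieve_request(base_url, query, knowledge_id, retriever_name,
                           embedding_config, namespace="default", top_k=5,
                           score_threshold=0.7, vector_weight=0.7,
                           keyword_weight=0.3):
    """Build (but do not send) a hybrid-mode POST /api/rag/retrieve request."""
    payload = {
        "query": query,
        "knowledge_id": knowledge_id,
        "retriever_ref": {"name": retriever_name, "namespace": namespace},
        "embedding_config": embedding_config,
        "top_k": top_k,
        "score_threshold": score_threshold,
        "retrieval_mode": "hybrid",
        "hybrid_weights": {
            "vector_weight": vector_weight,
            "keyword_weight": keyword_weight,
        },
    }
    return urllib.request.Request(
        f"{base_url}/api/rag/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_retrieve_request(
    "http://localhost:8000",  # assumed backend address
    "How do I configure a bot?",
    "kb_001",
    "my-es-retriever",
    {"provider": "openai", "model": "text-embedding-3-small", "api_key": "sk-..."},
)
# urllib.request.urlopen(request) would send it; the response schema is not covered here.
```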
3. Manage Documents
# List documents
GET /api/rag/documents?knowledge_id=kb_001&retriever_name=my-es-retriever&page=1&page_size=20
# Get document details
GET /api/rag/documents/{doc_ref}?knowledge_id=kb_001&retriever_name=my-es-retriever
# Delete document
DELETE /api/rag/documents/{doc_ref}?knowledge_id=kb_001&retriever_name=my-es-retriever
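When calling these endpoints from code, the query string and the doc_ref path segment should be URL-encoded. A minimal sketch follows; `documents_url` is a hypothetical helper, the backend address is assumed, and "doc 1" is a made-up doc_ref chosen only to show the encoding.

```python
from urllib.parse import quote, urlencode


def documents_url(base_url, knowledge_id, retriever_name, doc_ref=None, **params):
    """Build a /api/rag/documents URL; pass doc_ref to address a single document."""
    path = "/api/rag/documents"
    if doc_ref is not None:
        path += "/" + quote(str(doc_ref), safe="")
    query = {"knowledge_id": knowledge_id, "retriever_name": retriever_name, **params}
    return f"{base_url}{path}?{urlencode(query)}"


list_url = documents_url("http://localhost:8000", "kb_001", "my-es-retriever",
                         page=1, page_size=20)
detail_url = documents_url("http://localhost:8000", "kb_001", "my-es-retriever",
                           doc_ref="doc 1")
```

The same `detail_url` serves both the GET (details) and DELETE calls; only the HTTP method differs.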
Embedding Providers
OpenAI
{
  "provider": "openai",
  "model": "text-embedding-3-small",
  "api_key": "sk-...",
  "base_url": "https://api.openai.com/v1"
}
The base_url field is optional for the openai provider.
Custom API (OpenAI-compatible)
{
  "provider": "custom",
  "model": "your-model-name",
  "api_key": "your-api-key",
  "base_url": "https://your-api-endpoint.com/v1"
}
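The two provider shapes differ mainly in whether base_url is needed. The sketch below builds either one; the helper name is hypothetical, and the rule that a custom provider must supply base_url is an assumption inferred from the examples rather than documented validation.

```python
def embedding_config(provider, model, api_key, base_url=None):
    """Return an embedding_config dict for the 'openai' or 'custom' provider."""
    if provider not in ("openai", "custom"):
        raise ValueError(f"unknown provider: {provider}")
    if provider == "custom" and not base_url:
        # Assumption: a custom OpenAI-compatible endpoint cannot be inferred.
        raise ValueError("custom providers need a base_url")
    config = {"provider": provider, "model": model, "api_key": api_key}
    if base_url:
        config["base_url"] = base_url
    return config


openai_config = embedding_config("openai", "text-embedding-3-small", "sk-...")
custom_config = embedding_config("custom", "your-model-name", "your-api-key",
                                 base_url="https://your-api-endpoint.com/v1")
```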
Resource Scopes
Retrievers support three scopes:
| Scope | Description | Access |
|---|---|---|
| Personal | Your private retrievers | Only you |
| Group | Shared within a group | Group members |
| Public | System-provided retrievers | All users |
Best Practices
1. Index Strategy Selection
- Use per_user for Elasticsearch (recommended)
- Use per_dataset for multi-tenant scenarios with dataset isolation
- Avoid fixed for production (only suitable for small, single-tenant deployments)
2. Retrieval Mode Selection
- Vector mode: Semantic understanding, concept matching
- Hybrid mode: Balanced semantic and exact matching (recommended for most use cases)
- Adjust weights: Based on query type (conceptual vs. precise)
3. Security
- Store credentials securely (use environment variables or secrets management)
- Use API keys instead of username/password when possible
- Restrict access using namespaces and groups
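One way to follow the first two points is to read storage credentials from the environment and prefer an API key when one is set. This is a sketch only: the WEGENT_ES_* variable names and the "apiKey" field are hypothetical, since this page shows only the username and password fields.

```python
import os


def storage_credentials(env=os.environ):
    """Prefer an API key, fall back to username/password, else return no auth."""
    api_key = env.get("WEGENT_ES_API_KEY")       # hypothetical variable name
    if api_key:
        return {"apiKey": api_key}                # hypothetical field name
    username = env.get("WEGENT_ES_USERNAME")      # hypothetical variable name
    password = env.get("WEGENT_ES_PASSWORD")      # hypothetical variable name
    if username and password:
        return {"username": username, "password": password}
    return {}
```

The returned dict can be merged into storageConfig before the retriever is created, which keeps secrets out of files checked into version control.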
4. Performance
- Choose appropriate index strategy based on dataset size
- Monitor storage backend performance
- Use per_user mode for Elasticsearch to avoid index explosion
- Set appropriate top_k and score_threshold values
5. Document Management
- Use meaningful knowledge_id values for organization
- Regularly clean up unused documents
- Monitor storage usage
Troubleshooting
Connection Failed
Problem: Cannot connect to Elasticsearch
Solutions:
- Verify Elasticsearch is running: docker ps | grep elasticsearch
- Check URL is correct: http://elasticsearch:9200 (internal) or http://localhost:9200 (external)
- Test connection: Use the Test Connection button in the UI
- Check credentials: Verify username/password or API key
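The connection checks above can also be scripted against Elasticsearch's cluster health endpoint (GET /_cluster/health). The sketch below assumes a cluster reachable without authentication; a secured cluster would reject the request and the function would report it as unreachable. The function name and the injectable `opener` parameter are illustrative choices.

```python
import json
import urllib.error
import urllib.request


def elasticsearch_status(url, opener=urllib.request.urlopen, timeout=5):
    """Return the cluster status ('green', 'yellow', 'red') or None if unreachable."""
    try:
        with opener(f"{url.rstrip('/')}/_cluster/health", timeout=timeout) as response:
            return json.load(response).get("status")
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, HTTP errors, timeouts, and non-JSON replies.
        return None
```

For example, `elasticsearch_status("http://localhost:9200")` checks the external address, while the internal address http://elasticsearch:9200 only resolves from inside the Docker network.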
Indexing Failed
Problem: Document upload fails
Solutions:
- Check file format is supported (MD, PDF, TXT, DOCX, code files)
- Verify embedding provider credentials
- Check Elasticsearch storage capacity
- Review backend logs for detailed error messages
Low Retrieval Quality
Problem: Retrieved chunks are not relevant
Solutions:
- Try hybrid mode instead of pure vector mode
- Adjust hybrid weights based on query type
- Lower score_threshold to get more results
- Use a better embedding model (e.g., text-embedding-3-large)
- Improve document chunking (automatic semantic chunking is used)
API Reference
For complete API documentation, see:
- Backend API docs: http://localhost:8000/api/docs
- AGENTS.md: RAG Services section
Using Knowledge Base Without RAG (No Retriever Mode)
You can create and use knowledge bases even without configuring a retriever. In this mode, the AI uses exploration tools instead of semantic search.
What Works Without RAG
- Document upload and storage
- Document viewing and editing
- AI can browse documents using kb_ls (list) and kb_head (read) tools
- Manual document exploration by AI
- Knowledge base chat in notebook mode
What Requires RAG Configuration
- Semantic search (knowledge_base_search tool)
- Vector similarity retrieval
- Automatic chunk-based retrieval
- Hybrid search (vector + keyword)
When to Use No-RAG Mode
Consider using knowledge bases without RAG when:
- No Vector Database Available: You don't have Elasticsearch or another vector database set up
- Small Knowledge Base: Your knowledge base is small enough for AI to read through documents
- Testing: You want to test knowledge base functionality without RAG infrastructure
- Cost Optimization: You want to avoid embedding model API costs
AI Behavior in No-RAG Mode
When you chat with an AI that has access to a knowledge base without RAG:
- Document Discovery: AI uses kb_ls to list available documents with summaries
- Content Selection: AI reviews document summaries to identify relevant ones
- Content Reading: AI uses kb_head to read document content (with pagination for large files)
- Answer Generation: AI answers based on the content it has read
Example Workflow
User: What does the API documentation say about authentication?
AI: Let me explore the knowledge base to find relevant information.
[Uses kb_ls to list documents]
Found 5 documents:
- api-guide.md (15KB) - API usage guide with authentication section
- setup.md (8KB) - Initial setup instructions
- ...
[Uses kb_head to read api-guide.md]
Reading authentication section from api-guide.md...
Based on the API documentation, authentication uses JWT tokens...
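The list-then-read loop in this workflow can be sketched over an in-memory knowledge base. The kb_ls and kb_head functions below are simplified stand-ins whose signatures are assumed, not the real tool interfaces, and the keyword match on summaries replaces the judgment the AI applies when choosing documents.

```python
# Hypothetical in-memory knowledge base: name -> (summary, content).
DOCS = {
    "api-guide.md": ("API usage guide with authentication section",
                     "Intro...\nAuthentication uses JWT tokens.\nMore..."),
    "setup.md": ("Initial setup instructions", "Install, then run."),
}


def kb_ls(docs):
    """Stand-in for kb_ls: list document names with their summaries."""
    return [(name, summary) for name, (summary, _) in docs.items()]


def kb_head(docs, name, offset=0, limit=2):
    """Stand-in for kb_head: read `limit` lines starting at line `offset`."""
    lines = docs[name][1].splitlines()
    return lines[offset:offset + limit]


def explore(docs, keyword):
    """List documents, keep those whose summary mentions the keyword, page through them."""
    hits = []
    for name, summary in kb_ls(docs):
        if keyword.lower() not in summary.lower():
            continue
        offset = 0
        while True:
            page = kb_head(docs, name, offset)
            if not page:
                break
            hits += [(name, line) for line in page if keyword.lower() in line.lower()]
            offset += len(page)
    return hits


found = explore(DOCS, "authentication")
```

Every selected document is read page by page, which is why token usage in this mode grows with document size.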
Performance Considerations
No-RAG mode is less efficient than RAG retrieval:
| Aspect | RAG Mode | No-RAG Mode |
|---|---|---|
| Search Speed | Fast (vector similarity) | Slower (sequential reading) |
| Token Usage | Lower (relevant chunks only) | Higher (may read full documents) |
| Accuracy | Semantic understanding | Depends on document summaries |
| Best For | Large knowledge bases | Small knowledge bases (<50 docs) |
Setting Up No-RAG Mode
- Create Knowledge Base: In the create dialog, skip the retrieval configuration section
- Upload Documents: Documents are stored but not indexed for RAG
- Start Chatting: AI will automatically use exploration tools
Note: You can always add RAG configuration later by editing the knowledge base settings after configuring a retriever.
Feedback
Since RAG functionality is experimental, we welcome your feedback:
- Report issues on GitHub
- Suggest improvements
- Share your use cases
Note: This feature is under active development. Check the changelog for updates.