Configuring Retrievers
Retrievers are configurations for RAG (Retrieval-Augmented Generation) functionality in Wegent. They define how documents are indexed, stored, and retrieved.
Prerequisites
- Wegent platform installed and running
- Elasticsearch service enabled (optional, only needed for RAG features)
docker compose --profile rag up -d
What is a Retriever?
A Retriever is a CRD (Custom Resource Definition) that configures:
- Storage Backend: Vector database connection (Elasticsearch, Qdrant)
- Index Strategy: How documents are organized in the database
- Retrieval Methods: Search modes (vector, keyword, hybrid)
- Embedding Configuration: How text is converted to vectors
Creating a Retriever
Via Web UI
- Navigate to Settings → Retrievers
- Click Add Retriever
- Fill in the configuration:
  - Name: Unique identifier (e.g., my-es-retriever)
  - Display Name: Human-readable name
  - Storage Type: Select elasticsearch or qdrant
  - URL: Storage backend URL (e.g., http://elasticsearch:9200)
  - Authentication: Username/password or API key (optional)
  - Index Strategy: Choose indexing mode
  - Retrieval Methods: Enable vector, keyword, or hybrid search
- Click Test Connection to verify settings
- Click Create to save
Via API
POST /api/retrievers
Content-Type: application/json

{
  "apiVersion": "agent.wecode.io/v1",
  "kind": "Retriever",
  "metadata": {
    "name": "my-es-retriever",
    "namespace": "default",
    "displayName": "My Elasticsearch Retriever"
  },
  "spec": {
    "storageConfig": {
      "type": "elasticsearch",
      "url": "http://elasticsearch:9200",
      "username": "elastic",
      "password": "password",
      "indexStrategy": {
        "mode": "per_user",
        "prefix": "wegent"
      }
    },
    "retrievalMethods": {
      "vector": {
        "enabled": true,
        "defaultWeight": 0.7
      },
      "keyword": {
        "enabled": true,
        "defaultWeight": 0.3
      },
      "hybrid": {
        "enabled": true
      }
    },
    "description": "Elasticsearch retriever for RAG"
  }
}
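The same request can be sent from a script. The following is a minimal Python sketch, assuming the backend listens on http://localhost:8000 and replies with a JSON body; the helper names `build_retriever` and `create_retriever` are illustrative, and any authentication headers your deployment requires are left out.

```python
import json
import urllib.request


def build_retriever(name, url, mode="per_user", prefix="wegent",
                    vector_weight=0.7, keyword_weight=0.3):
    """Build a Retriever resource mirroring the JSON example above."""
    return {
        "apiVersion": "agent.wecode.io/v1",
        "kind": "Retriever",
        "metadata": {"name": name, "namespace": "default"},
        "spec": {
            "storageConfig": {
                "type": "elasticsearch",
                "url": url,
                "indexStrategy": {"mode": mode, "prefix": prefix},
            },
            "retrievalMethods": {
                "vector": {"enabled": True, "defaultWeight": vector_weight},
                "keyword": {"enabled": True, "defaultWeight": keyword_weight},
                "hybrid": {"enabled": True},
            },
        },
    }


def create_retriever(payload, base_url="http://localhost:8000"):
    """POST the resource to /api/retrievers and return the decoded reply.

    Assumption: the endpoint answers with JSON; the base URL is a guess
    for a local setup and may differ in your deployment.
    """
    request = urllib.request.Request(
        f"{base_url}/api/retrievers",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the network call makes it possible to inspect or validate the resource before anything is sent.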
Index Strategies
Choose an index strategy based on your use case:
| Strategy | Description | Best For |
|---|---|---|
| per_user | One index per user | Elasticsearch deployments, user-level isolation |
| per_dataset | One index per knowledge base | Multi-tenant scenarios, dataset isolation |
| fixed | Single fixed index | Small datasets, simple setup |
| rolling | Hash-based sharding | Large datasets, load distribution |
Recommended: per_user Mode
For Elasticsearch, we recommend using per_user mode:
{
  "indexStrategy": {
    "mode": "per_user",
    "prefix": "wegent"
  }
}
This creates indices like wegent_user_123, providing better performance and isolation.
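To make the naming concrete, here is a small Python sketch of how an index name could be derived for each strategy. Only the per_user pattern (prefix, then `_user_`, then the user id) comes from this guide; the patterns for per_dataset, fixed, and rolling, the `shards` parameter, and the function name `index_name` are hypothetical and may not match what Wegent actually does.

```python
import hashlib


def index_name(mode, prefix="wegent", user_id=None, knowledge_id=None, shards=8):
    """Return an illustrative index name for the given strategy."""
    if mode == "per_user":
        # Pattern documented in this guide, e.g. wegent_user_123.
        return f"{prefix}_user_{user_id}"
    if mode == "per_dataset":
        # Hypothetical pattern: one index per knowledge base.
        return f"{prefix}_kb_{knowledge_id}"
    if mode == "fixed":
        # Hypothetical pattern: a single shared index.
        return prefix
    if mode == "rolling":
        # Hypothetical pattern: a stable hash of the knowledge base id
        # selects one of `shards` buckets.
        digest = hashlib.sha256(str(knowledge_id).encode("utf-8")).hexdigest()
        return f"{prefix}_{int(digest, 16) % shards}"
    raise ValueError(f"unknown index strategy: {mode}")
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the bucket assignment identical across restarts.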
Retrieval Methods
Vector Search (Semantic)
Pure vector similarity search for semantic understanding:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | vector |
| top_k | Number of results | 5 |
| score_threshold | Relevance threshold | 0.5 |
Use cases: Concept matching, understanding questions, semantic search
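As an illustration of how top_k and score_threshold interact, the sketch below ranks chunks by cosine similarity in plain Python. It assumes cosine similarity as the metric and filtering before truncation; the storage backend performs the real search, and its exact scoring is not specified in this guide. The function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, chunks, top_k=5, score_threshold=0.5):
    """Rank (chunk_id, vector) pairs against the query vector.

    Chunks scoring below score_threshold are dropped first, then the
    best top_k of the remainder are returned, highest score first.
    """
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunks]
    kept = [item for item in scored if item[1] >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]
```

Raising score_threshold can return fewer than top_k results, which is often preferable to padding the answer context with weak matches.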
Keyword Search
Traditional BM25 keyword matching:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | keyword |
| top_k | Number of results | 5 |
Use cases: Exact term matching, code search, API names
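For readers unfamiliar with BM25, the following is a compact Python sketch of the scoring formula. It is a teaching aid, not the backend's implementation: tokenization is a plain lowercase whitespace split, and the parameters k1=1.2 and b=0.75 are commonly used defaults assumed here rather than values taken from Wegent.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with a common BM25 variant."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            freq = term_freq[term]
            if freq == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            denom = freq + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * freq * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Because the score depends on exact token matches, identifiers such as function or API names rank well here even when an embedding model would treat them as noise.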
Hybrid Search (Vector + Keyword)
Combines vector similarity with BM25 keyword matching:
| Parameter | Description | Default |
|---|---|---|
| retrieval_mode | Retrieval mode | hybrid |
| vector_weight | Vector weight | 0.7 |
| keyword_weight | Keyword weight | 0.3 |
| top_k | Number of results | 5 |
| score_threshold | Relevance threshold | 0.5 |
Weight recommendations:
- Conceptual queries (0.8/0.2): Understanding, explanations
- Balanced (0.7/0.3): General purpose (default)
- Precise matching (0.3/0.7): Code search, API names, exact terms
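The effect of the weights can be seen in a small Python sketch. It assumes a weighted sum of per-chunk scores that are already normalized to the range 0 to 1, with the threshold applied to the combined score; this guide does not state how Wegent fuses the two result sets, so treat the formula and the name `hybrid_rank` as assumptions.

```python
def hybrid_rank(vector_scores, keyword_scores, vector_weight=0.7,
                keyword_weight=0.3, top_k=5, score_threshold=0.5):
    """Combine per-chunk scores with a weighted sum.

    Both inputs map chunk id -> score in [0, 1]. A chunk missing from
    one side contributes 0 for that side.
    """
    chunk_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        cid: vector_weight * vector_scores.get(cid, 0.0)
        + keyword_weight * keyword_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return [(cid, score) for cid, score in ranked if score >= score_threshold][:top_k]


vector = {"a": 0.9, "b": 0.4}
keyword = {"b": 1.0, "c": 0.8}
balanced = hybrid_rank(vector, keyword)           # chunks "a" then "b"
precise = hybrid_rank(vector, keyword, 0.3, 0.7)  # chunks "b" then "c"
```

Switching from 0.7/0.3 to 0.3/0.7 promotes the chunks that matched on exact terms and drops the one that matched only semantically.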
Retrieval Test
Before saving a retrieval configuration, use the retrieval test feature to check that it returns relevant results.
How to Use
- Go to Knowledge Base Retrieval Settings
- Configure retrieval parameters (mode, top_k, threshold, etc.)
- Enter a test query in the Retrieval Test area
- Click the Test button
- Review returned document chunks and relevance scores
- Adjust parameters based on results
- Click Save when satisfied
Test Recommendations
| Test Type | Suggested Query |
|---|---|
| Conceptual | Use descriptive questions like "What is..." |
| Precise | Use specific terms or names |
| Boundary | Use vague or unrelated queries |
The retrieval test helps you tune the retrieval configuration before putting it into use.
Using Retrievers with RAG
1. Retrieve Relevant Chunks
POST /api/rag/retrieve
Content-Type: application/json

{
  "query": "How do I configure a bot?",
  "knowledge_id": "kb_001",
  "retriever_ref": {
    "name": "my-es-retriever",
    "namespace": "default"
  },
  "embedding_config": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "sk-..."
  },
  "top_k": 5,
  "score_threshold": 0.5,
  "retrieval_mode": "hybrid",
  "hybrid_weights": {
    "vector_weight": 0.7,
    "keyword_weight": 0.3
  }
}
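A hedged Python sketch of the same call follows. The field names come from the request above; the base URL, the helper names, the JSON reply, and the choice to send hybrid_weights only in hybrid mode are assumptions. The response is returned undecoded beyond JSON parsing because its schema is not described here.

```python
import json
import urllib.request


def build_retrieve_request(query, knowledge_id, retriever_name, api_key,
                           retrieval_mode="hybrid", top_k=5,
                           score_threshold=0.5, vector_weight=0.7,
                           keyword_weight=0.3):
    """Build the body for POST /api/rag/retrieve."""
    body = {
        "query": query,
        "knowledge_id": knowledge_id,
        "retriever_ref": {"name": retriever_name, "namespace": "default"},
        "embedding_config": {
            "provider": "openai",
            "model": "text-embedding-3-small",
            "api_key": api_key,
        },
        "top_k": top_k,
        "score_threshold": score_threshold,
        "retrieval_mode": retrieval_mode,
    }
    if retrieval_mode == "hybrid":
        # Assumption: weights are only meaningful in hybrid mode.
        body["hybrid_weights"] = {
            "vector_weight": vector_weight,
            "keyword_weight": keyword_weight,
        }
    return body


def retrieve(body, base_url="http://localhost:8000"):
    """Send the request and return the parsed JSON reply (schema not assumed)."""
    request = urllib.request.Request(
        f"{base_url}/api/rag/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_retrieve_request` with different `retrieval_mode` values against the same query is a simple way to compare modes outside the UI.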
Embedding Providers
OpenAI
{
  "provider": "openai",
  "model": "text-embedding-3-small",
  "api_key": "sk-...",
  "base_url": "https://api.openai.com/v1" // Optional
}
Custom API (OpenAI-compatible)
{
  "provider": "custom",
  "model": "your-model-name",
  "api_key": "your-api-key",
  "base_url": "https://your-api-endpoint.com/v1"
}
Resource Scopes
Retrievers support three scopes:
| Scope | Description | Access |
|---|---|---|
| Personal | Your private retrievers | Only you |
| Group | Shared within a group | Group members |
| Public | System-provided retrievers | All users |
Best Practices
1. Index Strategy Selection
- Use per_user for Elasticsearch (recommended)
- Use per_dataset for multi-tenant scenarios with dataset isolation
- Avoid fixed for production (only suitable for small, single-tenant deployments)
2. Retrieval Mode Selection
- Vector mode: Semantic understanding, concept matching
- Hybrid mode: Balanced semantic and exact matching (recommended for most use cases)
- Adjust weights: Based on query type (conceptual vs. precise)
3. Security
- Store credentials securely (use environment variables or secrets management)
- Use API keys instead of username/password when possible
- Restrict access using namespaces and groups
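One way to keep keys out of scripts and request files is to read them from the environment at run time. The Python sketch below builds an embedding_config this way; the variable names EMBEDDING_API_KEY, EMBEDDING_PROVIDER, EMBEDDING_MODEL, and EMBEDDING_BASE_URL are hypothetical and not defined by Wegent.

```python
import os


def embedding_config_from_env(environ=os.environ):
    """Build an embedding_config dict without hard-coding the API key.

    All EMBEDDING_* variable names are hypothetical; substitute the
    names your deployment or secrets manager exposes.
    """
    api_key = environ.get("EMBEDDING_API_KEY")
    if not api_key:
        raise RuntimeError("EMBEDDING_API_KEY is not set")
    config = {
        "provider": environ.get("EMBEDDING_PROVIDER", "openai"),
        "model": environ.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        "api_key": api_key,
    }
    base_url = environ.get("EMBEDDING_BASE_URL")
    if base_url:
        # Only include base_url when one is configured.
        config["base_url"] = base_url
    return config
```

Failing fast when the key is missing avoids sending a request that the embedding provider would reject anyway.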
4. Performance
- Choose appropriate index strategy based on dataset size
- Monitor storage backend performance
- Use per_user mode for Elasticsearch to avoid index explosion
- Set appropriate top_k and score_threshold values
5. Document Management
- Use meaningful knowledge_id values for organization
- Regularly clean up unused documents
- Monitor storage usage
Troubleshooting
Connection Failed
Problem: Cannot connect to Elasticsearch
Solutions:
- Verify Elasticsearch is running: docker ps | grep elasticsearch
- Check URL is correct: http://elasticsearch:9200 (internal) or http://localhost:9200 (external)
- Test connection: Use the Test Connection button in the UI
- Check credentials: Verify username/password or API key
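The URL and credential checks can also be scripted. The Python sketch below issues a plain HTTP GET against the storage URL with optional basic authentication and reports whether it answered with status 200. It is a generic reachability probe with illustrative function names, not the logic behind the Test Connection button.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def can_reach(url, username=None, password=None, timeout=3):
    """Return True if a GET on the storage URL succeeds with status 200."""
    headers = basic_auth_header(username, password) if username else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Refused connections, HTTP error statuses (for example 401 on
        # bad credentials) and timeouts surface as OSError subclasses.
        return False
```

Run it with http://localhost:9200 from the host and with http://elasticsearch:9200 from inside a container on the same Docker network to tell a wrong URL apart from a stopped service.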
Indexing Failed
Problem: Document upload fails
Solutions:
- Check file format is supported (MD, PDF, TXT, DOCX, code files)
- Verify embedding provider credentials
- Check Elasticsearch storage capacity
- Review backend logs for detailed error messages
Low Retrieval Quality
Problem: Retrieved chunks are not relevant
Solutions:
- Try hybrid mode instead of pure vector mode
- Adjust hybrid weights based on query type
- Lower score_threshold to get more results
- Use a better embedding model (e.g., text-embedding-3-large)
- Improve document chunking (automatic semantic chunking is used)
API Reference
For complete API documentation, see:
- Backend API docs: http://localhost:8000/api/docs
- AGENTS.md: RAG Services section
Using Knowledge Base Without RAG (No Retriever Mode)
You can create and use knowledge bases even without configuring a retriever. In this mode, the AI uses exploration tools instead of semantic search.
What Works Without RAG
- Document upload and storage
- Document viewing and editing
- AI can browse documents using kb_ls (list) and kb_head (read) tools
- Manual document exploration by AI
- Knowledge base chat in notebook mode
What Requires RAG Configuration
- Semantic search (knowledge_base_search tool)
- Vector similarity retrieval
- Automatic chunk-based retrieval
- Hybrid search (vector + keyword)
When to Use No-RAG Mode
Consider using knowledge bases without RAG when:
- No Vector Database Available: You don't have Elasticsearch or other vector database set up
- Small Knowledge Base: Your knowledge base is small enough for AI to read through documents
- Testing: You want to test knowledge base functionality without RAG infrastructure
- Cost Optimization: You want to avoid embedding model API costs
AI Behavior in No-RAG Mode
When you chat with an AI that has access to a knowledge base without RAG:
- Document Discovery: AI uses kb_ls to list available documents with summaries
- Content Selection: AI reviews document summaries to identify relevant ones
- Content Reading: AI uses kb_head to read document content (with pagination for large files)
- Answer Generation: AI answers based on the content it has read
Example Workflow
User: What does the API documentation say about authentication?
AI: Let me explore the knowledge base to find relevant information.
[Uses kb_ls to list documents]
Found 5 documents:
- api-guide.md (15KB) - API usage guide with authentication section
- setup.md (8KB) - Initial setup instructions
- ...
[Uses kb_head to read api-guide.md]
Reading authentication section from api-guide.md...
Based on the API documentation, authentication uses JWT tokens...
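The list-then-read pattern in this workflow can be sketched in Python with an in-memory stand-in for the knowledge base. The functions below borrow the names kb_ls and kb_head for readability, but their signatures, the `store` structure, and the word-overlap selection rule are all hypothetical; in Wegent the model itself decides which documents to open.

```python
def kb_ls(store):
    """List documents as (name, length_in_characters, summary) tuples."""
    return [(name, len(doc["content"]), doc["summary"]) for name, doc in store.items()]


def kb_head(store, name, offset=0, limit=200):
    """Return one page of a document and whether more content remains."""
    content = store[name]["content"]
    page = content[offset:offset + limit]
    return page, offset + limit < len(content)


def explore(store, question, limit=200):
    """Read every document whose summary shares a word with the question."""
    words = set(question.lower().split())
    collected = {}
    for name, _length, summary in kb_ls(store):
        if not words & set(summary.lower().split()):
            continue  # summary looks unrelated, skip the document
        pages, offset, more = [], 0, True
        while more:
            # Page through the document until nothing is left.
            page, more = kb_head(store, name, offset, limit)
            pages.append(page)
            offset += limit
        collected[name] = "".join(pages)
    return collected
```

The sketch also shows why this mode costs more tokens: every selected document is read in full, page by page, rather than as a few relevant chunks.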
Performance Considerations
This approach is less efficient than RAG retrieval:
| Aspect | RAG Mode | No-RAG Mode |
|---|---|---|
| Search Speed | Fast (vector similarity) | Slower (sequential reading) |
| Token Usage | Lower (relevant chunks only) | Higher (may read full documents) |
| Accuracy | Semantic understanding | Depends on document summaries |
| Best For | Large knowledge bases | Small knowledge bases (<50 docs) |
Choosing The RAG Mode On Creation
When creating a knowledge base, RAG Retrieval in Advanced Settings provides two modes:
| Mode | Behavior | Best For |
|---|---|---|
| Auto | The system selects an available retriever and embedding model while preserving retrieval parameters you changed | Recommended default |
| No RAG | Creates a no-RAG knowledge base without writing retrieval configuration | No vector database, small knowledge bases, or testing |
If you choose automatic configuration but the current environment has no usable default retriever or embedding model, the knowledge base can still be created, but no RAG retrieval configuration is written. Uploaded documents are stored, and AI will use knowledge exploration tools to read the content in later conversations.
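The fallback described above amounts to a small decision rule, sketched here in Python. The mode strings "auto", "no_rag", and "rag" and the function name `resolve_rag_mode` are hypothetical labels chosen for the example, not values from the Wegent API.

```python
def resolve_rag_mode(requested, has_retriever, has_embedding_model):
    """Return "rag" when retrieval configuration would be written, else "no_rag"."""
    if requested == "no_rag":
        return "no_rag"
    if requested == "auto":
        # Auto needs both a usable retriever and an embedding model;
        # otherwise the knowledge base is still created, without RAG.
        return "rag" if has_retriever and has_embedding_model else "no_rag"
    raise ValueError(f"unknown RAG mode: {requested}")
```

In other words, choosing Auto never blocks knowledge base creation; it only determines whether retrieval configuration is attached.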
To create a no-RAG knowledge base:
- Create Knowledge Base: Open advanced settings in the create dialog and set RAG Retrieval to No RAG
- Upload Documents: Documents are stored but not indexed for RAG
- Start Chatting: AI will automatically use exploration tools
Note: After configuring a retriever and embedding model, newly created knowledge bases will automatically receive RAG retrieval configuration.
Feedback
Since RAG functionality is experimental, we welcome your feedback:
- Report issues on GitHub
- Suggest improvements
- Share your use cases
Note: This feature is under active development. Check the changelog for updates.