Configuring Retrievers

Retrievers are configurations for RAG (Retrieval-Augmented Generation) functionality in Wegent. They define how documents are indexed, stored, and retrieved.

Prerequisites

Wegent platform installed and running
Elasticsearch service enabled (optional, only needed for RAG features). To start it with the rag Compose profile:
```
docker compose --profile rag up -d
```

What is a Retriever?

A Retriever is a CRD (Custom Resource Definition) that configures:

Storage Backend: Vector database connection (Elasticsearch, Qdrant)
Index Strategy: How documents are organized in the database
Retrieval Methods: Search modes (vector, keyword, hybrid)
Embedding Configuration: How text is converted to vectors

Creating a Retriever

Via Web UI

Navigate to Settings → Retrievers
Click Add Retriever
Fill in the configuration:
- Name: Unique identifier (e.g., my-es-retriever)
- Display Name: Human-readable name
- Storage Type: Select elasticsearch or qdrant
- URL: Storage backend URL (e.g., http://elasticsearch:9200)
- Authentication: Username/password or API key (optional)
- Index Strategy: Choose indexing mode
- Retrieval Methods: Enable vector, keyword, or hybrid search
Click Test Connection to verify settings
Click Create to save

Via API

POST /api/retrievers
Content-Type: application/json

{
  "apiVersion": "agent.wecode.io/v1",
  "kind": "Retriever",
  "metadata": {
    "name": "my-es-retriever",
    "namespace": "default",
    "displayName": "My Elasticsearch Retriever"
  },
  "spec": {
    "storageConfig": {
      "type": "elasticsearch",
      "url": "http://elasticsearch:9200",
      "username": "elastic",
      "password": "password",
      "indexStrategy": {
        "mode": "per_user",
        "prefix": "wegent"
      }
    },
    "retrievalMethods": {
      "vector": {
        "enabled": true,
        "defaultWeight": 0.7
      },
      "keyword": {
        "enabled": true,
        "defaultWeight": 0.3
      },
      "hybrid": {
        "enabled": true
      }
    },
    "description": "Elasticsearch retriever for RAG"
  }
}
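The request above can also be prepared from a script. The following is a minimal sketch using only the Python standard library. It assumes the backend is reachable at http://localhost:8000 (the host used by the API docs link on this page) and that no extra authentication header is needed; both are assumptions, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: the backend listens on localhost:8000; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def build_retriever(name: str, display_name: str, es_url: str,
                    username: str, password: str) -> dict:
    """Build a Retriever resource that mirrors the JSON example above."""
    return {
        "apiVersion": "agent.wecode.io/v1",
        "kind": "Retriever",
        "metadata": {
            "name": name,
            "namespace": "default",
            "displayName": display_name,
        },
        "spec": {
            "storageConfig": {
                "type": "elasticsearch",
                "url": es_url,
                "username": username,
                "password": password,
                "indexStrategy": {"mode": "per_user", "prefix": "wegent"},
            },
            "retrievalMethods": {
                "vector": {"enabled": True, "defaultWeight": 0.7},
                "keyword": {"enabled": True, "defaultWeight": 0.3},
                "hybrid": {"enabled": True},
            },
            "description": "Elasticsearch retriever for RAG",
        },
    }


def create_retriever_request(resource: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST /api/retrievers request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/retrievers",
        data=json.dumps(resource).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen against a running backend, adding whatever authentication headers your deployment requires.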

Index Strategies

Choose an index strategy based on your use case:

Strategy	Description	Best For
per_user	One index per user	Elasticsearch deployments, user-level isolation
per_dataset	One index per knowledge base	Multi-tenant scenarios, dataset isolation
fixed	Single fixed index	Small datasets, simple setup
rolling	Hash-based sharding	Large datasets, load distribution

Recommended: per_user Mode

For Elasticsearch, we recommend using per_user mode:

{
  "indexStrategy": {
    "mode": "per_user",
    "prefix": "wegent"
  }
}

This creates one index per user (for example, wegent_user_123), which isolates each user's data and keeps queries scoped to that user's index.
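To make the four strategies concrete, here is an illustrative sketch of how an index name could be derived for each mode. Only the per_user pattern (prefix_user_ID) is taken from this page. The patterns for per_dataset, fixed, and rolling, the shard count, and the function itself are assumptions for illustration and may not match what Wegent actually does.

```python
import hashlib


def index_name(mode: str, prefix: str, user_id: int = 0,
               knowledge_id: str = "", shards: int = 4) -> str:
    """Return an illustrative index name for a given index strategy.

    Only the per_user pattern comes from the documentation;
    the other patterns are assumptions made for this sketch.
    """
    if mode == "per_user":
        return f"{prefix}_user_{user_id}"
    if mode == "per_dataset":
        return f"{prefix}_kb_{knowledge_id}"  # assumed pattern
    if mode == "fixed":
        return prefix  # assumed: a single shared index
    if mode == "rolling":
        # Assumed: a stable hash of the knowledge base ID selects one of N indices.
        digest = hashlib.sha256(knowledge_id.encode("utf-8")).hexdigest()
        return f"{prefix}_{int(digest, 16) % shards}"
    raise ValueError(f"unknown index strategy: {mode}")
```

The point of the sketch is the trade-off: per_user and per_dataset grow the number of indices with users or knowledge bases, while fixed and rolling keep the index count constant.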

Retrieval Methods

Vector Search (Semantic)

Pure vector similarity search for semantic understanding:

Parameter	Description	Default
`retrieval_mode`	Retrieval mode	`vector`
`top_k`	Number of results	5
`score_threshold`	Relevance threshold	0.7

Use cases: Concept matching, understanding questions, semantic search

Keyword Search

Traditional BM25 keyword matching:

Parameter	Description	Default
`retrieval_mode`	Retrieval mode	`keyword`
`top_k`	Number of results	5

Use cases: Exact term matching, code search, API names

Hybrid Search (Vector + Keyword)

Combines vector similarity with BM25 keyword matching:

Parameter	Description	Default
`retrieval_mode`	Retrieval mode	`hybrid`
`vector_weight`	Vector weight	0.7
`keyword_weight`	Keyword weight	0.3
`top_k`	Number of results	5
`score_threshold`	Relevance threshold	0.7

Weight recommendations (vector weight / keyword weight):

Conceptual queries (0.8/0.2): Understanding, explanations
Balanced (0.7/0.3): General purpose (default)
Precise matching (0.3/0.7): Code search, API names, exact terms
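The effect of the weights can be illustrated with a small sketch. It assumes that both the vector score and the keyword score are already normalized to the range 0 to 1 and that they are combined as a plain weighted sum. That is one common way to do hybrid fusion, not necessarily the one Wegent uses, and the function name is hypothetical.

```python
def hybrid_scores(vector_scores: dict, keyword_scores: dict,
                  vector_weight: float = 0.7, keyword_weight: float = 0.3,
                  top_k: int = 5, score_threshold: float = 0.7) -> list:
    """Combine per-chunk scores as a weighted sum, then filter and rank.

    Assumes both inputs map chunk IDs to scores normalized to [0, 1].
    """
    chunk_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        cid: vector_weight * vector_scores.get(cid, 0.0)
        + keyword_weight * keyword_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Drop chunks below the relevance threshold, best score first.
    kept = [(cid, score) for cid, score in combined.items() if score >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


vector = {"a": 0.9, "b": 0.6, "c": 0.2}
keyword = {"a": 0.5, "b": 1.0, "c": 0.9}

balanced = hybrid_scores(vector, keyword)            # default 0.7 / 0.3
precise = hybrid_scores(vector, keyword, 0.3, 0.7)   # keyword-heavy
```

With these toy scores, the balanced weights keep chunks a and b, while the keyword-heavy weights keep only chunk b, which shows how shifting the weights changes which chunks clear the threshold.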

Retrieval Test

Before saving a retrieval configuration, use the retrieval test feature to check that it returns relevant results.

How to Use

Go to Knowledge Base Retrieval Settings
Configure retrieval parameters (mode, top_k, threshold, etc.)
Enter a test query in the Retrieval Test area
Click the Test button
Review returned document chunks and relevance scores
Adjust parameters based on results
Click Save when satisfied

Test Recommendations

Test Type	Suggested Query
Conceptual	Use descriptive questions like "What is..."
Precise	Use specific terms or names
Boundary	Use vague or unrelated queries

The retrieval test helps you tune the retrieval configuration before it is used in real conversations.
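The same three test types can be scripted if you want to compare configurations repeatedly. The sketch below is a hypothetical helper, not part of Wegent: it takes any search function that returns a list of chunk texts for a query and reports the fraction of test cases that behave as expected.

```python
from typing import Callable, List, Optional, Tuple


def hit_rate(search: Callable[[str], List[str]],
             cases: List[Tuple[str, Optional[str]]]) -> float:
    """Fraction of cases that pass.

    A case (query, expected) passes when `expected` appears in any returned
    chunk. A boundary case (query, None) passes when nothing is returned.
    """
    if not cases:
        return 0.0
    hits = 0
    for query, expected in cases:
        chunks = search(query)
        if expected is None:
            passed = len(chunks) == 0
        else:
            passed = any(expected.lower() in chunk.lower() for chunk in chunks)
        hits += 1 if passed else 0
    return hits / len(cases)


def fake_search(query: str) -> List[str]:
    """Toy stand-in for a real retrieval call, used only to show the helper."""
    corpus = ["A bot is configured in the settings page.", "POST /api/retrievers creates a retriever."]
    return [text for text in corpus if any(word in text.lower() for word in query.lower().split())]


cases = [
    ("what is a bot", "bot"),               # conceptual
    ("/api/retrievers", "/api/retrievers"),  # precise
    ("zzz qqq", None),                       # boundary
]
rate = hit_rate(fake_search, cases)
```

In practice, the search function would wrap a call to the retrieve endpoint described in the next section.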

Using Retrievers with RAG

1. Upload and Index Documents

POST /api/rag/documents/upload
Content-Type: multipart/form-data

- knowledge_id: "kb_001"
- retriever_name: "my-es-retriever"
- retriever_namespace: "default"
- file: <document.pdf>
- embedding_config: {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "sk-..."
  }

Supported file types: MD, PDF, TXT, DOCX, and code files
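The upload is a multipart/form-data request, which can be built with the Python standard library. This is a minimal sketch: the field names come from the list above, while the choice to send embedding_config as a JSON-encoded string field and the helper name are assumptions.

```python
import json
import uuid


def encode_multipart(fields: dict, file_name: str, file_bytes: bytes):
    """Encode text fields plus one file part as multipart/form-data.

    Returns (body, content_type). The file part is named "file",
    matching the field list above.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode()
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


fields = {
    "knowledge_id": "kb_001",
    "retriever_name": "my-es-retriever",
    "retriever_namespace": "default",
    # Assumption: embedding_config is sent as a JSON-encoded string field.
    "embedding_config": json.dumps({
        "provider": "openai",
        "model": "text-embedding-3-small",
        "api_key": "sk-...",
    }),
}
body, content_type = encode_multipart(fields, "document.pdf", b"%PDF-1.4 placeholder")
```

The resulting body would be sent as a POST to /api/rag/documents/upload with the returned value as the Content-Type header.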

2. Retrieve Relevant Chunks

POST /api/rag/retrieve
Content-Type: application/json

{
  "query": "How do I configure a bot?",
  "knowledge_id": "kb_001",
  "retriever_ref": {
    "name": "my-es-retriever",
    "namespace": "default"
  },
  "embedding_config": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "sk-..."
  },
  "top_k": 5,
  "score_threshold": 0.7,
  "retrieval_mode": "hybrid",
  "hybrid_weights": {
    "vector_weight": 0.7,
    "keyword_weight": 0.3
  }
}
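A retrieve call can be wrapped in a small helper. The sketch below builds the same request body as the example above. The base URL, the helper names, and the decision to include hybrid_weights only in hybrid mode are assumptions, and the response is returned as parsed JSON because its shape is not described on this page.

```python
import json
import urllib.request

# Assumption: the backend listens on localhost:8000; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def build_retrieve_body(query: str, knowledge_id: str, retriever_name: str,
                        embedding_config: dict, mode: str = "hybrid",
                        top_k: int = 5, score_threshold: float = 0.7,
                        vector_weight: float = 0.7, keyword_weight: float = 0.3,
                        namespace: str = "default") -> dict:
    """Build the JSON body for POST /api/rag/retrieve."""
    body = {
        "query": query,
        "knowledge_id": knowledge_id,
        "retriever_ref": {"name": retriever_name, "namespace": namespace},
        "embedding_config": embedding_config,
        "top_k": top_k,
        "score_threshold": score_threshold,
        "retrieval_mode": mode,
    }
    if mode == "hybrid":
        # Assumption: weights are only meaningful in hybrid mode.
        body["hybrid_weights"] = {
            "vector_weight": vector_weight,
            "keyword_weight": keyword_weight,
        }
    return body


def retrieve(body: dict, timeout: float = 30.0):
    """Send the request and return the parsed JSON response (shape not assumed)."""
    req = urllib.request.Request(
        url=f"{BASE_URL}/api/rag/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling retrieve requires a running backend and valid embedding credentials; build_retrieve_body can be used on its own to inspect or log the payload.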

3. Manage Documents

# List documents
GET /api/rag/documents?knowledge_id=kb_001&retriever_name=my-es-retriever&page=1&page_size=20

# Get document details
GET /api/rag/documents/{doc_ref}?knowledge_id=kb_001&retriever_name=my-es-retriever

# Delete document
DELETE /api/rag/documents/{doc_ref}?knowledge_id=kb_001&retriever_name=my-es-retriever
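The three document endpoints share the same query parameters, so a single URL builder covers them. This is a small sketch with an assumed base URL and a hypothetical helper name; it escapes the document reference so that special characters do not break the path.

```python
from urllib.parse import quote, urlencode

# Assumption: the backend listens on localhost:8000; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def documents_url(knowledge_id: str, retriever_name: str,
                  doc_ref: str = "", **params) -> str:
    """Build a /api/rag/documents URL for list, detail, or delete calls."""
    query = urlencode({
        "knowledge_id": knowledge_id,
        "retriever_name": retriever_name,
        **params,
    })
    path = "/api/rag/documents"
    if doc_ref:
        # Percent-encode the reference, including any slashes.
        path += "/" + quote(doc_ref, safe="")
    return f"{BASE_URL}{path}?{query}"


list_url = documents_url("kb_001", "my-es-retriever", page=1, page_size=20)
detail_url = documents_url("kb_001", "my-es-retriever", doc_ref="doc 1")
```

Use the list URL with GET, and the detail URL with GET or DELETE, as shown in the endpoint list above.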

Embedding Providers

OpenAI

{
  "provider": "openai",
  "model": "text-embedding-3-small",
  "api_key": "sk-...",
  "base_url": "https://api.openai.com/v1"
}

The base_url field is optional.

Custom API (OpenAI-compatible)

{
  "provider": "custom",
  "model": "your-model-name",
  "api_key": "your-api-key",
  "base_url": "https://your-api-endpoint.com/v1"
}
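Both provider configurations have the same shape, so they can be produced by one helper that reads the API key from the environment instead of hard-coding it. This is a sketch only: the helper and the environment variable name EMBEDDING_API_KEY are hypothetical.

```python
import os


def embedding_config(provider: str, model: str,
                     api_key_env: str = "EMBEDDING_API_KEY",
                     base_url: str = "") -> dict:
    """Build an embedding_config dict, reading the API key from the environment.

    The environment variable name is an assumption for this sketch.
    """
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"environment variable {api_key_env} is not set")
    config = {"provider": provider, "model": model, "api_key": api_key}
    if base_url:
        # base_url is optional for OpenAI and needed for a custom endpoint.
        config["base_url"] = base_url
    return config
```

For example, embedding_config("custom", "your-model-name", base_url="https://your-api-endpoint.com/v1") yields the custom provider configuration shown above with the key taken from the environment.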

Resource Scopes

Retrievers support three scopes:

Scope	Description	Access
Personal	Your private retrievers	Only you
Group	Shared within a group	Group members
Public	System-provided retrievers	All users

Best Practices

1. Index Strategy Selection

Use per_user for Elasticsearch (recommended)
Use per_dataset for multi-tenant scenarios with dataset isolation
Avoid fixed for production (only suitable for small, single-tenant deployments)

2. Retrieval Mode Selection

Vector mode: Semantic understanding, concept matching
Hybrid mode: Balanced semantic and exact matching (recommended for most use cases)
Adjust weights: Based on query type (conceptual vs. precise)

3. Security

Store credentials securely (use environment variables or secrets management)
Use API keys instead of username/password when possible
Restrict access using namespaces and groups

4. Performance

Choose appropriate index strategy based on dataset size
Monitor storage backend performance
Use per_user mode for Elasticsearch to avoid index explosion
Set appropriate top_k and score_threshold values

5. Document Management

Use meaningful knowledge_id values for organization
Regularly clean up unused documents
Monitor storage usage

Troubleshooting

Connection Failed

Problem: Cannot connect to Elasticsearch

Solutions:

Verify Elasticsearch is running: docker ps | grep elasticsearch
Check URL is correct: http://elasticsearch:9200 (internal) or http://localhost:9200 (external)
Test connection: Use the Test Connection button in the UI
Check credentials: Verify username/password or API key
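The checks above can be combined into a short script that distinguishes an unreachable server from rejected credentials. This is a sketch using the Python standard library with HTTP basic authentication; the function name and the returned status strings are hypothetical.

```python
import base64
import urllib.error
import urllib.request


def check_elasticsearch(url: str, username: str = "", password: str = "",
                        timeout: float = 3.0) -> str:
    """Return "ok", "auth_failed", "http_<code>", or "unreachable"."""
    req = urllib.request.Request(url)
    if username:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        req.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(req, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as exc:
        # The server answered, but refused or failed the request.
        return "auth_failed" if exc.code in (401, 403) else f"http_{exc.code}"
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return "unreachable"
```

Run it with http://localhost:9200 from the host, or with http://elasticsearch:9200 from inside the Docker network, to see which of the two URLs applies to your setup.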

Indexing Failed

Problem: Document upload fails

Solutions:

Check file format is supported (MD, PDF, TXT, DOCX, code files)
Verify embedding provider credentials
Check Elasticsearch storage capacity
Review backend logs for detailed error messages

Low Retrieval Quality

Problem: Retrieved chunks are not relevant

Solutions:

Try hybrid mode instead of pure vector mode
Adjust hybrid weights based on query type
Lower score_threshold to get more results
Use a better embedding model (e.g., text-embedding-3-large)
Improve the source documents' structure, since chunking is automatic (semantic chunking is used)

API Reference

For complete API documentation, see:

Backend API docs: http://localhost:8000/api/docs
AGENTS.md: RAG Services section

Using Knowledge Base Without RAG (No Retriever Mode)

You can create and use knowledge bases even without configuring a retriever. In this mode, the AI uses exploration tools instead of semantic search.

What Works Without RAG

✅ Document upload and storage
✅ Document viewing and editing
✅ AI can browse documents using kb_ls (list) and kb_head (read) tools
✅ Manual document exploration by AI
✅ Knowledge base chat in notebook mode

What Requires RAG Configuration

❌ Semantic search (knowledge_base_search tool)
❌ Vector similarity retrieval
❌ Automatic chunk-based retrieval
❌ Hybrid search (vector + keyword)

When to Use No-RAG Mode

Consider using knowledge bases without RAG when:

No Vector Database Available: You don't have Elasticsearch or another vector database set up
Small Knowledge Base: Your knowledge base is small enough for AI to read through documents
Testing: You want to test knowledge base functionality without RAG infrastructure
Cost Optimization: You want to avoid embedding model API costs

AI Behavior in No-RAG Mode

When you chat with an AI that has access to a knowledge base without RAG:

Document Discovery: AI uses kb_ls to list available documents with summaries
Content Selection: AI reviews document summaries to identify relevant ones
Content Reading: AI uses kb_head to read document content (with pagination for large files)
Answer Generation: AI answers based on the content it has read
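The four steps above amount to a list, select, and paginated read loop. The sketch below imitates that loop over an in-memory dictionary. The kb_ls and kb_head functions here are toy stand-ins with assumed signatures, not the real tools, and the sample documents are made up for illustration.

```python
# Toy knowledge base: document name -> content (illustrative data only).
DOCS = {
    "api-guide.md": "Authentication uses JWT tokens.\nSend the token in a header.\n",
    "setup.md": "Install the platform.\nStart the services.\n",
}


def kb_ls(docs: dict) -> list:
    """Stand-in for kb_ls: list documents with size and a one-line summary."""
    return [
        {"name": name, "size": len(text),
         "summary": text.splitlines()[0] if text else ""}
        for name, text in docs.items()
    ]


def kb_head(docs: dict, name: str, offset: int = 0, limit: int = 1) -> list:
    """Stand-in for kb_head: read `limit` lines starting at line `offset`."""
    return docs[name].splitlines()[offset:offset + limit]


def explore(docs: dict, keyword: str, page_size: int = 1) -> list:
    """Select documents whose summary mentions the keyword, then read them page by page."""
    found = []
    for entry in kb_ls(docs):
        if keyword.lower() not in entry["summary"].lower():
            continue
        offset = 0
        while True:
            page = kb_head(docs, entry["name"], offset, page_size)
            if not page:
                break
            found.extend(page)
            offset += page_size
    return found
```

Because selection relies on summaries and reading is sequential, the cost of this loop grows with the number and size of documents, which is why it suits small knowledge bases.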

Example Workflow

User: What does the API documentation say about authentication?

AI: Let me explore the knowledge base to find relevant information.

[Uses kb_ls to list documents]
Found 5 documents:
- api-guide.md (15KB) - API usage guide with authentication section
- setup.md (8KB) - Initial setup instructions
- ...

[Uses kb_head to read api-guide.md]
Reading authentication section from api-guide.md...

Based on the API documentation, authentication uses JWT tokens...

Performance Considerations

This approach is less efficient than RAG retrieval:

Aspect	RAG Mode	No-RAG Mode
Search Speed	Fast (vector similarity)	Slower (sequential reading)
Token Usage	Lower (relevant chunks only)	Higher (may read full documents)
Accuracy	Semantic understanding	Depends on document summaries
Best For	Large knowledge bases	Small knowledge bases (<50 docs)

Setting Up No-RAG Mode

Create Knowledge Base: In the create dialog, skip the retrieval configuration section
Upload Documents: Documents are stored but not indexed for RAG
Start Chatting: AI will automatically use exploration tools

Note: You can always add RAG configuration later by editing the knowledge base settings after configuring a retriever.

Feedback

Since RAG functionality is experimental, we welcome your feedback:

Report issues on GitHub
Suggest improvements
Share your use cases

Note: This feature is under active development. Check the changelog for updates.
