
# RAG Service Configuration
This document describes how to configure the RAG service, including how to set up the Language Model (LLM) and Embedding providers.
## Provider Support Matrix
The following table shows which model types are supported by each provider:
| Provider | LLM Support | Embedding Support |
| ---------- | ----------- | ----------------- |
| dashscope | Yes | Yes |
| ollama | Yes | Yes |
| openai | Yes | Yes |
| openrouter | Yes | No |
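
Both the `llm` and `embedding` sections described below live inside your avante.nvim setup. A minimal combined sketch (assuming the plugin exposes a `rag_service` table with `llm` and `embed` sub-tables; adjust names to your actual setup):

```lua
-- Sketch only: assumes avante.nvim nests the RAG service config under
-- `rag_service` as shown; field meanings are documented in the sections below.
require("avante").setup({
  rag_service = {
    enabled = true, -- Enable the RAG service
    llm = { -- See "LLM Provider Configuration" below
      provider = "openai",
      endpoint = "https://api.openai.com/v1",
      api_key = "OPENAI_API_KEY",
      model = "gpt-4o-mini",
      extra = nil,
    },
    embed = { -- See "Embedding Provider Configuration" below
      provider = "openai",
      endpoint = "https://api.openai.com/v1",
      api_key = "OPENAI_API_KEY",
      model = "text-embedding-3-large",
      extra = nil,
    },
  },
})
```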
## LLM Provider Configuration
The `llm` section in the configuration file is used to configure the Language Model (LLM) used by the RAG service.
Here are the configuration examples for each supported LLM provider:
### OpenAI LLM Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-openai/llama_index/llms/openai/base.py#L130)
```lua
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "openai", -- The LLM provider ("openai")
endpoint = "https://api.openai.com/v1", -- The LLM API endpoint
api_key = "OPENAI_API_KEY", -- The environment variable name for the LLM API key
model = "gpt-4o-mini", -- The LLM model name (e.g., "gpt-4o-mini", "gpt-3.5-turbo")
extra = { -- Extra configuration options for the LLM (optional)
temperature = 0.7, -- Controls the randomness of the output. Lower values make it more deterministic.
max_tokens = 512, -- The maximum number of tokens to generate in the completion.
-- system_prompt = "You are a helpful assistant.", -- A system prompt to guide the model's behavior.
-- timeout = 120, -- Request timeout in seconds.
},
},
```
### DashScope LLM Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-dashscope/llama_index/llms/dashscope/base.py#L155)
```lua
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "dashscope", -- The LLM provider ("dashscope")
endpoint = "", -- The LLM API endpoint (DashScope typically uses default or environment variables)
api_key = "DASHSCOPE_API_KEY", -- The environment variable name for the LLM API key
model = "qwen-plus", -- The LLM model name (e.g., "qwen-plus", "qwen-max")
extra = nil, -- Extra configuration options for the LLM (optional)
},
```
### Ollama LLM Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-ollama/llama_index/llms/ollama/base.py#L65)
```lua
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "ollama", -- The LLM provider ("ollama")
endpoint = "http://localhost:11434", -- The LLM API endpoint for Ollama
api_key = "", -- Ollama typically does not require an API key
model = "llama2", -- The LLM model name (e.g., "llama2", "mistral")
extra = nil, -- Extra configuration options for the LLM (optional)
},
```
### OpenRouter LLM Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-openrouter/llama_index/llms/openrouter/base.py#L17)
```lua
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "openrouter", -- The LLM provider ("openrouter")
endpoint = "https://openrouter.ai/api/v1", -- The LLM API endpoint for OpenRouter
api_key = "OPENROUTER_API_KEY", -- The environment variable name for the LLM API key
model = "openai/gpt-4o-mini", -- The LLM model name (e.g., "openai/gpt-4o-mini", "mistralai/mistral-7b-instruct")
extra = nil, -- Extra configuration options for the LLM (optional)
},
```
## Embedding Provider Configuration
The `embedding` section in the configuration file is used to configure the Embedding Model used by the RAG service.
Here are the configuration examples for each supported Embedding provider:
### OpenAI Embedding Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/embeddings/llama-index-embeddings-openai/llama_index/embeddings/openai/base.py#L214)
```lua
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "openai", -- The Embedding provider ("openai")
endpoint = "https://api.openai.com/v1", -- The Embedding API endpoint
api_key = "OPENAI_API_KEY", -- The environment variable name for the Embedding API key
model = "text-embedding-3-large", -- The Embedding model name (e.g., "text-embedding-3-small", "text-embedding-3-large")
extra = { -- Extra configuration options for the Embedding model (optional)
dimensions = nil, -- Output vector dimensions (nil uses the model's default)
},
},
```
### DashScope Embedding Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/embeddings/llama-index-embeddings-dashscope/llama_index/embeddings/dashscope/base.py#L156)
```lua
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "dashscope", -- The Embedding provider ("dashscope")
endpoint = "", -- The Embedding API endpoint (DashScope typically uses default or environment variables)
api_key = "DASHSCOPE_API_KEY", -- The environment variable name for the Embedding API key
model = "text-embedding-v3", -- The Embedding model name (e.g., "text-embedding-v2", "text-embedding-v3")
extra = { -- Extra configuration options for the Embedding model (optional)
embed_batch_size = 10, -- Number of texts sent per embedding request
},
},
```
### Ollama Embedding Configuration
[See more configurations](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/embeddings/llama-index-embeddings-ollama/llama_index/embeddings/ollama/base.py#L12)
```lua
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "ollama", -- The Embedding provider ("ollama")
endpoint = "http://localhost:11434", -- The Embedding API endpoint for Ollama
api_key = "", -- Ollama typically does not require an API key
model = "nomic-embed-text", -- The Embedding model name (e.g., "nomic-embed-text")
extra = { -- Extra configuration options for the Embedding model (optional)
embed_batch_size = 10, -- Number of texts sent per embedding request
},
},
```
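
Note that in every example above, `api_key` holds the *name* of an environment variable, not the key itself. A minimal sketch of exporting those variables in the shell that launches Neovim (variable names taken from the examples above; the values are placeholders):

```shell
# The RAG service reads credentials from the environment variables whose
# names appear in the `api_key` fields of the config (placeholder values shown).
export OPENAI_API_KEY="your-openai-key"
export DASHSCOPE_API_KEY="your-dashscope-key"
export OPENROUTER_API_KEY="your-openrouter-key"
```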