RAG Service Configuration
This document describes how to configure the RAG service, including setting up Language Model (LLM) and Embedding providers.
Provider Support Matrix
The following table shows which model types are supported by each provider:
| Provider | LLM Support | Embedding Support |
|---|---|---|
| dashscope | Yes | Yes |
| ollama | Yes | Yes |
| openai | Yes | Yes |
| openrouter | Yes | No |
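The `llm` and `embed` sections documented below typically sit side by side in the RAG service configuration. A minimal sketch of the overall shape, using OpenAI for both roles (the enclosing `rag_service` table name is an assumption for illustration and is not specified in this document):

```lua
rag_service = { -- Hypothetical enclosing table; only the llm/embed sections are documented here
  llm = { -- Language Model used by the RAG service
    provider = "openai",
    endpoint = "https://api.openai.com/v1",
    api_key = "OPENAI_API_KEY", -- environment variable name, not the key itself
    model = "gpt-4o-mini",
    extra = nil,
  },
  embed = { -- Embedding Model used by the RAG service
    provider = "openai",
    endpoint = "https://api.openai.com/v1",
    api_key = "OPENAI_API_KEY",
    model = "text-embedding-3-large",
    extra = nil,
  },
},
```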
LLM Provider Configuration
The llm section in the configuration file is used to configure the Language Model (LLM) used by the RAG service.
Here are the configuration examples for each supported LLM provider:
OpenAI LLM Configuration
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "openai", -- The LLM provider ("openai")
endpoint = "https://api.openai.com/v1", -- The LLM API endpoint
api_key = "OPENAI_API_KEY", -- The environment variable name for the LLM API key
model = "gpt-4o-mini", -- The LLM model name (e.g., "gpt-4o-mini", "gpt-3.5-turbo")
extra = { -- Extra configuration options for the LLM (optional)
temperature = 0.7, -- Controls the randomness of the output. Lower values make it more deterministic.
max_tokens = 512, -- The maximum number of tokens to generate in the completion.
-- system_prompt = "You are a helpful assistant.", -- A system prompt to guide the model's behavior.
-- timeout = 120, -- Request timeout in seconds.
},
},
DashScope LLM Configuration
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "dashscope", -- The LLM provider ("dashscope")
endpoint = "", -- The LLM API endpoint (DashScope typically uses default or environment variables)
api_key = "DASHSCOPE_API_KEY", -- The environment variable name for the LLM API key
model = "qwen-plus", -- The LLM model name (e.g., "qwen-plus", "qwen-max")
extra = nil, -- Extra configuration options for the LLM (optional)
},
Ollama LLM Configuration
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "ollama", -- The LLM provider ("ollama")
endpoint = "http://localhost:11434", -- The LLM API endpoint for Ollama
api_key = "", -- Ollama typically does not require an API key
model = "llama2", -- The LLM model name (e.g., "llama2", "mistral")
extra = nil, -- Extra configuration options for the LLM (optional)
},
OpenRouter LLM Configuration
llm = { -- Configuration for the Language Model (LLM) used by the RAG service
provider = "openrouter", -- The LLM provider ("openrouter")
endpoint = "https://openrouter.ai/api/v1", -- The LLM API endpoint for OpenRouter
api_key = "OPENROUTER_API_KEY", -- The environment variable name for the LLM API key
model = "openai/gpt-4o-mini", -- The LLM model name (e.g., "openai/gpt-4o-mini", "mistralai/mistral-7b-instruct")
extra = nil, -- Extra configuration options for the LLM (optional)
},
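Note that OpenRouter provides LLM access only (see the support matrix above), so it must be paired with an embedding section from a provider that supports embeddings. One possible combination, pairing OpenRouter with a local Ollama instance (model choices are examples, not recommendations):

```lua
llm = { -- LLM served through OpenRouter
  provider = "openrouter",
  endpoint = "https://openrouter.ai/api/v1",
  api_key = "OPENROUTER_API_KEY", -- The environment variable name for the API key
  model = "openai/gpt-4o-mini",
  extra = nil,
},
embed = { -- Embeddings served by a local Ollama instance (OpenRouter has no embedding support)
  provider = "ollama",
  endpoint = "http://localhost:11434",
  api_key = "", -- Ollama typically does not require an API key
  model = "nomic-embed-text",
  extra = nil,
},
```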
Embedding Provider Configuration
The embedding section in the configuration file is used to configure the Embedding Model used by the RAG service.
Here are the configuration examples for each supported Embedding provider:
OpenAI Embedding Configuration
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "openai", -- The Embedding provider ("openai")
endpoint = "https://api.openai.com/v1", -- The Embedding API endpoint
api_key = "OPENAI_API_KEY", -- The environment variable name for the Embedding API key
model = "text-embedding-3-large", -- The Embedding model name (e.g., "text-embedding-3-small", "text-embedding-3-large")
extra = { -- Extra configuration options for the Embedding model (optional)
dimensions = nil, -- Optional number of output dimensions (supported by the text-embedding-3 models)
},
},
DashScope Embedding Configuration
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "dashscope", -- The Embedding provider ("dashscope")
endpoint = "", -- The Embedding API endpoint (DashScope typically uses default or environment variables)
api_key = "DASHSCOPE_API_KEY", -- The environment variable name for the Embedding API key
model = "text-embedding-v3", -- The Embedding model name (e.g., "text-embedding-v2", "text-embedding-v3")
extra = { -- Extra configuration options for the Embedding model (optional)
embed_batch_size = 10,
},
},
Ollama Embedding Configuration
embed = { -- Configuration for the Embedding Model used by the RAG service
provider = "ollama", -- The Embedding provider ("ollama")
endpoint = "http://localhost:11434", -- The Embedding API endpoint for Ollama
api_key = "", -- Ollama typically does not require an API key
model = "nomic-embed-text", -- The Embedding model name (e.g., "nomic-embed-text")
extra = { -- Extra configuration options for the Embedding model (optional)
embed_batch_size = 10,
},
},
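For a fully local setup that requires no API keys, Ollama can fill both roles. A sketch combining the two Ollama sections above (the models shown must be pulled locally first, e.g. with `ollama pull nomic-embed-text`):

```lua
llm = { -- Local Language Model via Ollama
  provider = "ollama",
  endpoint = "http://localhost:11434",
  api_key = "", -- No API key needed for a local Ollama server
  model = "llama2",
  extra = nil,
},
embed = { -- Local Embedding Model via Ollama
  provider = "ollama",
  endpoint = "http://localhost:11434",
  api_key = "",
  model = "nomic-embed-text", -- Must be an embedding-capable model
  extra = {
    embed_batch_size = 10,
  },
},
```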