Models
Supported LLM providers and how to configure them in OpenAgent.
OpenAgent integrates with 30+ language model providers through a unified interface. You can mix and match providers across different agents without changing any agent logic.
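The unified interface can be pictured roughly like this. This is a hypothetical sketch, not OpenAgent's actual internal API: the `ChatModel` protocol, adapter classes, and `run_agent` helper are illustrative names. The point is that every provider adapter exposes the same method, so agent logic never touches provider-specific details.

```python
from typing import Protocol

class ChatModel(Protocol):
    """Hypothetical provider-agnostic interface (illustrative only)."""
    def complete(self, prompt: str) -> str: ...

class OpenAIModel:
    """Stand-in adapter; real code would call the OpenAI API."""
    def __init__(self, model: str):
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] reply to: {prompt}"

class OllamaModel:
    """Stand-in adapter; real code would POST to the local Ollama server."""
    def __init__(self, model: str, base_url: str = "http://localhost:11434"):
        self.model = model
        self.base_url = base_url

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] reply to: {prompt}"

def run_agent(model: ChatModel, question: str) -> str:
    # Agent logic is identical regardless of which provider backs `model`.
    return model.complete(question)
```

Swapping a cloud model for a local one then means constructing a different adapter, with no change to the agent code itself.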
Supported Providers
Cloud Providers
| Provider | Models | Notes |
|---|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1, o3 | DALL-E 3 for image gen |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | Via Anthropic API |
| Google Gemini | Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash | Vertex AI supported |
| Azure OpenAI | All OpenAI models | Hosted in your Azure subscription |
| Amazon Bedrock | Claude, Llama, Titan | AWS region support |
| DeepSeek | DeepSeek-V3, DeepSeek-R1 | Strong coding & reasoning |
| Mistral | Mistral Large, Mistral 7B, Mixtral 8x7B | Open weights options |
| Groq | Llama 3.x, Mixtral, Gemma | Ultra-fast inference |
| OpenRouter | 200+ models | Model aggregation gateway |
Chinese Providers
| Provider | Models |
|---|---|
| Alibaba Qwen | Qwen-Max, Qwen-Plus, Qwen-Turbo |
| Baidu Ernie | ERNIE 4.0, ERNIE 3.5 |
| Zhipu ChatGLM | GLM-4, GLM-4V |
| Baichuan | Baichuan2-Turbo |
| Moonshot | Moonshot-v1-8k/32k/128k |
| MiniMax | MiniMax-Text-01 |
| StepFun | Step-1, Step-1V |
| Hunyuan | Hunyuan-Pro, Hunyuan-Standard |
| Doubao | Doubao-Pro |
Local / Self-Hosted
| Provider | Notes |
|---|---|
| Ollama | Run Llama, Mistral, Phi, Gemma locally |
| LM Studio | OpenAI-compatible local API |
| Hugging Face | Inference API + local models |
| LiteLLM | Proxy any model via unified API |
Configuring a Provider
Via the Dashboard
1. Go to Settings → Model Providers
2. Click Add Provider
3. Select your provider from the list
4. Enter the required credentials (API key, base URL, etc.)
5. Click Test Connection, then Save
Configuration Examples

OpenAI:

```json
{
  "provider": "openai",
  "api_key": "sk-...",
  "models": ["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"]
}
```

Anthropic:

```json
{
  "provider": "anthropic",
  "api_key": "sk-ant-...",
  "models": ["claude-3-5-sonnet-20241022", "claude-3-haiku-20240307"]
}
```

Ollama:

```json
{
  "provider": "ollama",
  "base_url": "http://localhost:11434",
  "models": ["llama3.2", "mistral", "phi4", "gemma2"]
}
```

No API key required. Ollama must be running locally with the models already pulled.

Azure OpenAI:

```json
{
  "provider": "azure_openai",
  "api_key": "...",
  "endpoint": "https://your-resource.openai.azure.com",
  "api_version": "2024-02-01",
  "deployment_name": "gpt-4o"
}
```

Embedding Models
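A provider config like the examples above can be validated in a few lines. The sketch below is hypothetical: the field names follow the examples in this page, but the `REQUIRED` table and `validate_provider_config` helper are assumptions for illustration, not OpenAgent's actual validation logic. Note that `ollama` needs a `base_url` rather than an `api_key`.

```python
import json

# Hypothetical required fields per provider, inferred from the examples above.
REQUIRED = {
    "openai": {"api_key", "models"},
    "anthropic": {"api_key", "models"},
    "ollama": {"base_url", "models"},
    "azure_openai": {"api_key", "endpoint", "api_version", "deployment_name"},
}

def validate_provider_config(raw: str) -> dict:
    """Parse a provider config and check that its required fields are present."""
    cfg = json.loads(raw)
    provider = cfg.get("provider")
    if provider not in REQUIRED:
        raise ValueError(f"unknown provider: {provider!r}")
    missing = REQUIRED[provider] - cfg.keys()
    if missing:
        raise ValueError(f"{provider}: missing fields {sorted(missing)}")
    return cfg
```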
OpenAgent uses embedding models separately from chat models. Embeddings power the knowledge base search.
| Provider | Embedding Models |
|---|---|
| OpenAI | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 |
| Cohere | embed-english-v3.0, embed-multilingual-v3.0 |
| Alibaba Qwen | text-embedding-v1, text-embedding-v2 |
| Ollama | nomic-embed-text, mxbai-embed-large |
| Jina | jina-embeddings-v3 |
You can use a different provider for embeddings than for chat. For example, use OpenAI embeddings with a local Ollama chat model.
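Whichever embedding provider you pick, knowledge base search works the same way underneath: the query and each document chunk are embedded as vectors, and chunks are ranked by cosine similarity to the query. A toy illustration with hand-made 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the vectors would come from the embedding model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings"; in practice these come from the embedding model.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api reference": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "how do I get a refund?"

best = max(docs, key=lambda name: cosine(query, docs[name]))
```

Because similarity is computed on vectors, the chat model never needs to match the embedding provider, which is what makes mixed setups like OpenAI embeddings with an Ollama chat model work.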
Model Selection Tips
For general-purpose assistants:
- `gpt-4o` or `claude-3-5-sonnet`: best balance of capability and speed
- `gpt-4o-mini` or `claude-3-haiku`: faster and cheaper for high-volume use
For document Q&A / RAG:
- Any frontier model works well; larger context windows help
- `gemini-1.5-pro` (1M token context) for very large document sets
For code generation and reasoning:
- `deepseek-r1` or `o3-mini` for complex reasoning tasks
- `claude-3-5-sonnet` for code generation
For cost-sensitive or offline deployments:
- Ollama with `llama3.2` or `mistral` for fully local, free inference
- `gpt-4o-mini` for the cheapest capable cloud model
Context Length Management
OpenAgent automatically manages context windows. When conversation history would exceed the model's context limit, it applies a sliding window strategy: the oldest turns are dropped while keeping the system prompt and recent history.
You can configure the context window size per agent to control memory vs. cost tradeoffs.
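The sliding-window strategy described above can be sketched as follows. This is a simplified illustration under stated assumptions: OpenAgent's real trimming almost certainly counts tokens rather than messages, and the function name is hypothetical; here a message budget stands in for the token budget.

```python
def apply_sliding_window(messages, max_messages):
    """Keep the system prompt plus the most recent turns.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    `max_messages` is a simplified stand-in for a token budget.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_messages - len(system)
    # Drop the oldest non-system turns first; keep the newest ones.
    return system + rest[-budget:] if budget > 0 else system
```

Raising the per-agent context window keeps more history (better memory, higher cost per call); lowering it trims more aggressively.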