Supported Providers
Anthropic
Default model: claude-sonnet-4-20250514. High quality for both compression and knowledge graph extraction.
OpenAI
Default model: gpt-4o-mini. Cost-effective for continuous background compression. Override with OPENAI_MODEL=gpt-4o.
Google Gemini
Default model: gemini-2.5-flash. Also auto-enables Gemini embeddings (gemini-embedding-001). Supports a free tier.
OpenRouter
Default model: anthropic/claude-sonnet-4-20250514. Routes to any model in the OpenRouter catalog, which is useful for cost optimization.
MiniMax
Default model: MiniMax-M2.7. Anthropic-compatible API. Good alternative for high-volume compression workloads.
Local / Ollama
Uses any OpenAI-compatible server. Zero API cost, fully offline. Works with Ollama, LM Studio, vLLM, and llama.cpp.
Setup for Each Provider
Add the relevant key to ~/.agentmemory/.env, then restart Agent Memory.
- Anthropic
- OpenAI
- Gemini
- OpenRouter
- Local / Ollama
Embedding Providers
Embeddings are configured separately from the LLM provider. Agent Memory auto-detects the embedding provider from your available keys, or you can set EMBEDDING_PROVIDER explicitly.
| Provider | Variable | Model | Notes |
|---|---|---|---|
| Local (default) | EMBEDDING_PROVIDER=local | all-MiniLM-L6-v2 (384-dim) | Free, offline, no key required. Ships bundled via @xenova/transformers. |
| Voyage AI | VOYAGE_API_KEY=pa-... | voyage-code-3 | Recommended for code projects. Optimized for code semantics and retrieval. |
| OpenAI | OPENAI_API_KEY=sk-... | text-embedding-3-small (1536-dim) | Enabled automatically when OPENAI_API_KEY is set. Override model with OPENAI_EMBEDDING_MODEL. |
| Gemini | GEMINI_API_KEY=... | gemini-embedding-001 | Enabled automatically when GEMINI_API_KEY is set. Supports 100+ languages. |
| Cohere | COHERE_API_KEY=... | embed-english-v3.0 | General-purpose embeddings with a free trial tier. |
| OpenRouter | OPENROUTER_API_KEY=... | configurable | Set OPENROUTER_EMBEDDING_MODEL to select the model. |
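The selection logic described above (explicit EMBEDDING_PROVIDER if set, otherwise whichever key is available, otherwise the bundled local model) can be pictured with a small sketch. This is not Agent Memory's actual implementation. The function name `resolve_embedding`, the provider identifiers other than `local`, and the order in which keys are checked are all assumptions made for illustration. Only the variable names and models come from the table.

```python
import os
from typing import Mapping, NamedTuple, Optional


class EmbeddingChoice(NamedTuple):
    provider: str
    model: Optional[str]  # None where the table lists the model as "configurable"


# (provider id, key variable, default model, model-override variable).
# Variable names and models are copied from the table; the provider ids
# (other than "local") and the row order used for detection are ASSUMPTIONS.
KEYED_PROVIDERS = (
    ("voyage", "VOYAGE_API_KEY", "voyage-code-3", None),
    ("openai", "OPENAI_API_KEY", "text-embedding-3-small", "OPENAI_EMBEDDING_MODEL"),
    ("gemini", "GEMINI_API_KEY", "gemini-embedding-001", None),
    ("cohere", "COHERE_API_KEY", "embed-english-v3.0", None),
    ("openrouter", "OPENROUTER_API_KEY", None, "OPENROUTER_EMBEDDING_MODEL"),
)

# The default row of the table: free, offline, no key required.
LOCAL = EmbeddingChoice("local", "all-MiniLM-L6-v2")


def resolve_embedding(env: Mapping[str, str]) -> EmbeddingChoice:
    """Pick an embedding provider from environment-style settings (hypothetical helper)."""
    explicit = env.get("EMBEDDING_PROVIDER", "").strip().lower()
    for provider, key_var, default_model, override_var in KEYED_PROVIDERS:
        # An explicit setting wins; otherwise the first non-empty key is used.
        selected = (provider == explicit) if explicit else bool(env.get(key_var))
        if selected:
            override = env.get(override_var) if override_var else None
            return EmbeddingChoice(provider, override or default_model)
    # No key found, or EMBEDDING_PROVIDER=local (or an unknown value): stay local.
    return LOCAL


if __name__ == "__main__":
    # Inspect what the sketch would choose for the current shell environment.
    print(resolve_embedding(os.environ))
```

Passing the environment in as a mapping keeps the function easy to try with a plain dictionary, for example `resolve_embedding({"OPENAI_API_KEY": "sk-..."})`, without touching real credentials.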
Provider Auto-Detection
Agent Memory checks for API keys in a fixed priority order and activates the first one it finds. You don't need to set EMBEDDING_PROVIDER or any provider name explicitly; setting your API key is enough.
Detection order for LLM providers:
Fallback Chain
If your primary LLM provider returns an error (for example, a rate limit or temporary outage), Agent Memory can automatically retry with a secondary provider:
Recommended Setup for Code Projects
For the best recall quality on code-heavy projects, use voyage-code-3 for embeddings. It is specifically trained on code and significantly outperforms general-purpose embedding models on code retrieval tasks. Pair it with any LLM provider for consolidation and graph extraction.