Generate vector embeddings for RAGE/semantic search and pgvector storage.
POST http://localhost:11434/api/embed
POST https://ollama.com/api/embed # Cloud (requires OLLAMA_API_KEY)
Parameters:
- model: embedding model to use (e.g. mxbai-embed-large, nomic-embed-text)
- input: text to embed, either a single string or an array of strings
- truncate: true (default) truncates input that exceeds the context length; false = error on overflow
- dimensions: number of dimensions for the returned embeddings (supported models only)
- keep_alive: how long the model stays loaded after the request (default "5m")
- options: additional model parameters

Response fields: model, embeddings (one vector per input), total_duration, load_duration, prompt_eval_count.

Basic example:
curl http://localhost:11434/api/embed -d '{
"model": "mxbai-embed-large",
"input": "The quick brown fox jumps over the lazy dog."
}'
Response:
{
"model": "mxbai-embed-large",
"embeddings": [[0.010071, -0.001759, 0.050072, ...]],
"total_duration": 14143917,
"load_duration": 1019500,
"prompt_eval_count": 8
}
curl http://localhost:11434/api/embed -d '{
"model": "mxbai-embed-large",
"input": [
"First document to embed",
"Second document to embed",
"Third document to embed"
]
}'
Returns embeddings array with one vector per input text.
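A minimal sketch of what semantic search does with the returned vectors: cosine similarity scores each document vector against a query vector, and documents are ranked by score (pgvector's distance operators do the same server-side). The function names here are illustrative, not part of the Ollama API.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec: list[float], doc_vecs: list[list[float]]) -> list[int]:
    """Return document indices sorted by similarity to the query, best first."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

Embed the query and the documents with the same model; mixing vectors from different embedding models produces meaningless similarities.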
curl http://localhost:11434/api/embed -d '{
"model": "mxbai-embed-large",
"input": "Generate embeddings for this text",
"dimensions": 128
}'
curl http://localhost:11434/api/embed -d '{
"model": "mxbai-embed-large",
"input": "Very long text that might exceed context...",
"truncate": false
}'
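Rather than letting truncation silently drop content (or erroring with truncate: false), long documents are usually split into overlapping chunks and each chunk embedded separately. A rough character-based sketch; the sizes are assumptions, since real limits are token-based and depend on the model's context length.

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split long text into overlapping character chunks before embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap  # step forward, keeping some overlap
    return chunks
```

The chunks can then be passed as the input array in a single /api/embed request.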
Recommended embedding models: mxbai-embed-large, nomic-embed-text, embeddinggemma, qwen3-embedding, all-minilm.
mindX currently uses mxbai-embed-large and nomic-embed-text for RAGE (not RAG) semantic search with pgvector.
# Direct embedding via aiohttp (extend OllamaAPI)
import aiohttp

async def embed_texts(texts: list[str], model: str = "mxbai-embed-large") -> list[list[float]]:
    """Generate embeddings via Ollama for pgvector storage."""
    payload = {"model": model, "input": texts}
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:11434/api/embed",
            json=payload,
            timeout=aiohttp.ClientTimeout(total=60),
        ) as resp:
            resp.raise_for_status()  # surface HTTP errors instead of a KeyError below
            data = await resp.json()
            return data["embeddings"]
Usage with pgvector:
embeddings = await embed_texts(["mindX autonomous improvement", "BDI reasoning engine"])
Store in pgvector: INSERT INTO memories (content, embedding) VALUES ($1, $2)
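The embeddings come back as plain Python lists, while pgvector expects a '[x,y,z]' literal (or a driver-registered vector type). A hedged sketch of the formatting step; the INSERT is shown as a comment because the driver choice (asyncpg, psycopg, etc.) is an assumption, not something the document specifies.

```python
def to_pgvector(vec: list[float]) -> str:
    """Format an embedding as a pgvector literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(f"{x:g}" for x in vec) + "]"

# With an async Postgres driver the insert would look roughly like
# (parameter names are illustrative):
#   await conn.execute(
#       "INSERT INTO memories (content, embedding) VALUES ($1, $2::vector)",
#       content, to_pgvector(embedding),
#   )
```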