Shows models currently loaded in memory with their resource usage.
```shell
curl http://localhost:11434/api/ps
```
```json
{
  "models": [
    {
      "name": "qwen3:1.7b",
      "model": "qwen3:1.7b",
      "size": 2800000000,
      "digest": "sha256:a2af6cc3eb7f...",
      "details": {
        "format": "gguf",
        "family": "qwen3",
        "families": ["qwen3"],
        "parameter_size": "1.7B",
        "quantization_level": "Q4_K_M"
      },
      "expires_at": "2025-10-17T16:47:07Z",
      "size_vram": 2500000000,
      "context_length": 4096
    }
  ]
}
```
Response fields: `name`, `size`, `digest`, `details`, `expires_at`, `size_vram`, `context_length`.

Monitor which models are loaded to prevent OOM on a 4GB VPS:
```python
import aiohttp

async def get_running_models(base_url="http://localhost:11434"):
    async with aiohttp.ClientSession() as session:
        async with session.get(f"{base_url}/api/ps") as resp:
            data = await resp.json()
            return data.get("models", [])

# Check if we need to unload before loading a new model
running = await get_running_models()
total_mem = sum(m["size"] for m in running)
if total_mem > 3_000_000_000:  # 3GB threshold on 4GB VPS
    # Unload least recently used first (earliest expires_at)
    for model in sorted(running, key=lambda m: m["expires_at"]):
        await unload_model(model["name"])
        total_mem -= model["size"]
        if total_mem <= 3_000_000_000:
            break
```
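The `unload_model` helper is not defined in the snippet above. Ollama unloads a model when it receives a request with `keep_alive` set to `0`; a minimal sketch using `/api/generate` (helper name chosen to match the snippet; the `keep_alive` behavior is documented Ollama API):

```python
import aiohttp

async def unload_model(name, base_url="http://localhost:11434"):
    # Sending keep_alive=0 with no prompt asks Ollama to unload
    # the model immediately without generating anything.
    payload = {"model": name, "keep_alive": 0}
    async with aiohttp.ClientSession() as session:
        async with session.post(f"{base_url}/api/generate", json=payload) as resp:
            resp.raise_for_status()
```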
```shell
curl http://localhost:11434/api/version
```
Response:
```json
{"version": "0.12.6"}
```
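Since API behavior can differ across releases, it can be useful to gate features on this version string. A small illustrative helper (not part of the Ollama API; assumes the plain dotted-integer format shown above):

```python
def version_at_least(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '0.12.6' vs '0.12.0'."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)

print(version_at_least("0.12.6", "0.12.0"))  # True
print(version_at_least("0.9.9", "0.12.0"))   # False (numeric, not lexical)
```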
```shell
# List running models
ollama ps

# Stop/unload a model
ollama stop qwen3:1.7b
```
Example `ollama ps` output:
```
NAME          ID              SIZE      PROCESSOR    UNTIL
qwen3:1.7b    abc123def456    1.4 GB    100% CPU     4 minutes from now
```
The PROCESSOR column shows:

- `100% GPU` — entirely on GPU
- `100% CPU` — entirely in system memory
- `48%/52% CPU/GPU` — split across both
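The percentages correspond to the `size` and `size_vram` fields from `/api/ps`: `size_vram` is the portion resident in GPU memory, so `size_vram / size` gives the GPU fraction. A small helper that reconstructs the column from those fields (illustrative, not an Ollama API):

```python
def processor_split(model: dict) -> str:
    """Mirror the PROCESSOR column using the size/size_vram fields."""
    size = model["size"]
    vram = model.get("size_vram", 0)
    if vram == 0:
        return "100% CPU"
    if vram >= size:
        return "100% GPU"
    gpu = round(100 * vram / size)
    return f"{100 - gpu}%/{gpu}% CPU/GPU"

# Using the example response above (2.5 GB of 2.8 GB in VRAM):
print(processor_split({"size": 2_800_000_000, "size_vram": 2_500_000_000}))
# → 11%/89% CPU/GPU
```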