
Monitoring and Rate Control (Both Directions)

Whether mindX is ingesting (receiving data from clients), providing inference (calling Ollama/LLMs), or running services (orchestration, memory, tools), monitoring and rate control are essential in both directions. This document defines the network and data metrics with explicit units (SI or standard units) and states where each one applies.


1. Both directions

| Direction | Role | Monitoring | Rate control |
| --- | --- | --- | --- |
| Inbound | Clients → mindX (ingestion, agent/call, ollama/ingest) | Request latency, payload size, throughput | Per-client or global req/s, req/min |
| Outbound | mindX → Ollama / LLM providers (inference) | Latency, tokens, throughput, errors | RPM, RPH, token bucket (see llm/rate_limiter.py) |

Both directions must be measured and, where configured, limited so that ingestion, inference, and services stay within capacity and quotas.
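
As an illustration of the token bucket named in the table, here is a minimal sketch. It is not the code in llm/rate_limiter.py; the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example. The bucket refills continuously at a configured requests-per-minute rate and reports how long a caller would have to wait before the next call is allowed.

```python
import time


class TokenBucket:
    """Hypothetical token bucket; NOT the implementation in llm/rate_limiter.py."""

    def __init__(self, rate_per_min: float, capacity: float, clock=time.monotonic):
        self.rate_per_s = rate_per_min / 60.0  # refill rate in tokens per second
        self.capacity = capacity               # maximum burst size
        self.tokens = capacity                 # start with a full bucket
        self.clock = clock                     # injectable so tests can fake time
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate_per_s)
        self.last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Take `cost` tokens if available; otherwise leave the bucket unchanged.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def wait_time_s(self, cost: float = 1.0) -> float:
        # Seconds until `cost` tokens will be available (0.0 if available now).
        self._refill()
        return max(0.0, (cost - self.tokens) / self.rate_per_s)
```

A caller on the outbound path could call `try_acquire()` before each LLM request and sleep for `wait_time_s()` when it returns `False`. The same structure works inbound with one bucket per client.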


2. Scientific network and data metrics

All metrics use explicit units. Prefer SI or widely used standards.

2.1 Time

| Symbol | Unit | Description |
| --- | --- | --- |
| \(t\) | s (second) | Wall-clock time |
| \(T_{\mathrm{lat}}\) | s or ms | Latency (request start → response end) |
| \(T_{\mathrm{wait}}\) | ms | Wait time in rate limiter before sending |
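
One way to record \(T_{\mathrm{wait}}\) and \(T_{\mathrm{lat}}\) in milliseconds is a small timing context manager around each phase. This is a sketch only: the helper name `timed`, the dictionary keys, and the `time.sleep` calls standing in for the limiter wait and the real request are all hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(record: dict, key: str):
    """Store the elapsed time of the wrapped block in record[key], in ms."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    try:
        yield
    finally:
        record[key] = (time.perf_counter() - start) * 1000.0


metrics = {}

with timed(metrics, "t_wait_ms"):
    time.sleep(0.01)  # stand-in for waiting in the rate limiter

with timed(metrics, "t_lat_ms"):
    time.sleep(0.02)  # stand-in for request start -> response end
```

`time.perf_counter()` is used for the intervals because it does not jump when the system clock is adjusted; wall-clock time \(t\) is still the right choice for timestamps attached to each sample.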

2.2 Data volume

| Symbol | Unit | Description |
| --- | --- | --- |
| \(B_{\mathrm{in}}\) | byte | Request body size (payload in) |
| \(B_{\mathrm{out}}\) | byte | Response body size (payload out) |
| \(N_{\mathrm{tok}}\) | 1 (dimensionless) | Token count (input + output) |
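
The sketch below shows how these quantities might be derived for a JSON exchange. The helper names are hypothetical. Body size is taken as the UTF-8 encoded length of the serialized payload, which can differ slightly from the bytes on the wire depending on how the HTTP client serializes it. The token count assumes the response carries `prompt_eval_count` and `eval_count`, as Ollama's generate and chat responses do; other providers report usage under different field names.

```python
import json


def body_size_bytes(payload: dict) -> int:
    """B_in or B_out: size in bytes of the JSON-serialized payload (UTF-8)."""
    return len(json.dumps(payload, ensure_ascii=False).encode("utf-8"))


def token_count(response: dict) -> int:
    """N_tok: input + output tokens, assuming Ollama-style count fields."""
    return int(response.get("prompt_eval_count", 0)) + int(response.get("eval_count", 0))
```

Byte counts and character counts diverge as soon as the payload contains non-ASCII text, which is why the size is measured after encoding.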

2.3 Rate (throughput)

| Quantity | Unit | Description |
| --- | --- | --- |
| Request rate (inbound) | req/s, req/min | Incoming API requests per unit time |
| Request rate (outbound) | req/min (RPM), req/h (RPH) | Outgoing calls to Ollama/LLM per unit time |
| Token rate | tokens/s, tokens/min (TPM) | Tokens consumed or generated per unit time |
| Data rate | byte/s (B/s), kB/s | Payload bytes per unit time |
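
All four rates can be computed the same way: sum an amount (1 per request, token count, or byte count) over a sliding time window and divide by the window length. The class below is a minimal sketch of that idea; `WindowRate` and its method names are hypothetical and do not refer to existing mindX code.

```python
from collections import deque


class WindowRate:
    """Hypothetical sliding-window rate: sum of amounts over the last window_s seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.events = deque()  # (timestamp in s, amount), oldest first
        self.total = 0.0       # running sum of amounts currently in the window

    def add(self, t: float, amount: float = 1.0) -> None:
        # amount = 1 for a request, N_tok for tokens, B_in or B_out for bytes.
        self.events.append((t, amount))
        self.total += amount
        self._evict(t)

    def _evict(self, now: float) -> None:
        # Drop events that are window_s or more seconds old.
        while self.events and self.events[0][0] <= now - self.window_s:
            _, amount = self.events.popleft()
            self.total -= amount

    def per_second(self, now: float) -> float:
        self._evict(now)
        return self.total / self.window_s

    def per_minute(self, now: float) -> float:
        return self.per_second(now) * 60.0
```

With a 60 s window, `per_minute` yields RPM when fed requests and TPM when fed token counts; a 3600 s window scaled to the hour gives RPH in the same way.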

2.4 Counts and ratios

| Quantity | Unit | Description |
| --- | --- | --- |
| Total requests | 1 | Cumulative count |
| Success / failure | 1 | Counts or ratio (dimensionless) |
| Rate limit hits | 1 | Count of requests delayed or blocked by limiter |
| Utilization | 0–1 or % | e.g. token bucket utilization, queue depth / max |
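
A minimal sketch of how these counts and ratios could be tracked follows. The `Counters` class and the `utilization` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Hypothetical cumulative counters for one direction (inbound or outbound)."""
    total: int = 0
    failures: int = 0
    rate_limit_hits: int = 0

    def record(self, ok: bool, limited: bool = False) -> None:
        # Count every request; track failures and limiter hits separately.
        self.total += 1
        if not ok:
            self.failures += 1
        if limited:
            self.rate_limit_hits += 1

    @property
    def success_ratio(self) -> float:
        # Dimensionless ratio in [0, 1]; defined as 1.0 before any request.
        if self.total == 0:
            return 1.0
        return (self.total - self.failures) / self.total


def utilization(used: float, maximum: float) -> float:
    """Fraction of capacity in use, clamped to [0, 1] (e.g. queue depth / max)."""
    if maximum <= 0:
        return 0.0
    return min(1.0, max(0.0, used / maximum))
```

Multiplying either ratio by 100 gives the percentage form listed in the table.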

3. Where metrics are collected

3.1 Inbound (clients → mindX)

3.2 Outbound (mindX → Ollama / LLMs)

3.3 Services (internal)


4. API and config


5. Summary

