We have extended the monitoring system to cover CPU, RAM, API token usage, and rate limiter performance. It now reports real-time data, supports configurable verbosity, and is intended for production use.
Rate Limiter (llm/rate_limiter.py)
New Features:
API Enhancements:
# Enhanced rate limiter with monitoring
rate_limiter = RateLimiter(
    requests_per_minute=60,
    monitoring_callback=monitoring_callback,  # New: real-time metrics
)

# Get comprehensive metrics
metrics = rate_limiter.get_metrics()
# Returns: success_rate, block_rate, avg_wait_time_ms, token_utilization, percentiles

# Get human-readable status
status = rate_limiter.get_status_summary()
# Returns: rate_limit, current_tokens, utilization, status (healthy/degraded/critical)
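To make the metric names above concrete, here is a minimal sketch of how counters like `success_rate`, `block_rate`, and `avg_wait_time_ms` could be derived. The `LimiterStats` class and the status cut-offs (95% and 50%) are assumptions for illustration only; they are not taken from `llm/rate_limiter.py`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LimiterStats:
    """Hypothetical counters; not the real RateLimiter internals."""
    allowed: int = 0
    blocked: int = 0
    wait_times_ms: List[float] = field(default_factory=list)

    def record(self, allowed: bool, wait_ms: float) -> None:
        # Count the outcome and remember how long the caller waited.
        if allowed:
            self.allowed += 1
        else:
            self.blocked += 1
        self.wait_times_ms.append(wait_ms)

    def metrics(self) -> Dict[str, float]:
        total = self.allowed + self.blocked
        if total == 0:
            return {"success_rate": 1.0, "block_rate": 0.0, "avg_wait_time_ms": 0.0}
        return {
            "success_rate": self.allowed / total,
            "block_rate": self.blocked / total,
            "avg_wait_time_ms": sum(self.wait_times_ms) / len(self.wait_times_ms),
        }

    def status(self) -> str:
        # Assumed cut-offs: >= 95% healthy, >= 50% degraded, otherwise critical.
        rate = self.metrics()["success_rate"]
        if rate >= 0.95:
            return "healthy"
        if rate >= 0.50:
            return "degraded"
        return "critical"


stats = LimiterStats()
for allowed, wait_ms in [(True, 0.0), (True, 12.0), (False, 40.0), (True, 8.0)]:
    stats.record(allowed, wait_ms)
print(stats.metrics(), stats.status())
```

With three of four requests allowed, the sketch reports a 75% success rate and a `degraded` status under the assumed cut-offs.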
Enhanced Monitoring System (monitoring/enhanced_monitoring_system.py)
Detailed CPU and RAM Monitoring:
API Token Usage Tracking:
Rate Limiter Performance Monitoring:
New Monitoring Methods:
# API Token Usage Tracking
await monitoring_system.log_api_token_usage(
    model_name="gpt-4",
    provider="openai",
    prompt_tokens=150,
    completion_tokens=200,
    cost_usd=0.005,
    success=True,
    rate_limited=False,
    metadata={"agent_id": "my_agent"},
)

# Rate Limiter Metrics Logging
await monitoring_system.log_rate_limiter_metrics(
    provider="openai",
    model_name="gpt-4",
    rate_limiter_metrics=rate_limiter.get_metrics(),
)

# Get comprehensive summaries
api_summary = await monitoring_system.get_api_usage_summary()
limiter_summary = await monitoring_system.get_rate_limiter_summary()
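As an illustration of what a usage summary might aggregate, the sketch below groups logged calls by provider and model and totals their tokens, cost, and rate-limit hits. `UsageRecord` and `summarize_usage` are hypothetical names; the real return shape of `get_api_usage_summary()` is not shown in this document.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical record mirroring the log_api_token_usage arguments."""
    provider: str
    model_name: str
    prompt_tokens: int
    completion_tokens: int
    cost_usd: float = 0.0
    rate_limited: bool = False


def summarize_usage(records: Iterable[UsageRecord]) -> Dict[str, Dict[str, float]]:
    """Total tokens, cost, and rate-limit hits per "provider/model" key."""
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"calls": 0, "prompt_tokens": 0, "completion_tokens": 0,
                 "total_tokens": 0, "cost_usd": 0.0, "rate_limit_hits": 0}
    )
    for r in records:
        entry = summary[f"{r.provider}/{r.model_name}"]
        entry["calls"] += 1
        entry["prompt_tokens"] += r.prompt_tokens
        entry["completion_tokens"] += r.completion_tokens
        entry["total_tokens"] += r.prompt_tokens + r.completion_tokens
        entry["cost_usd"] += r.cost_usd
        entry["rate_limit_hits"] += int(r.rate_limited)
    return dict(summary)


records = [
    UsageRecord("openai", "gpt-4", 150, 200, cost_usd=0.005),
    UsageRecord("gemini", "gemini-pro", 0, 0, rate_limited=True),
]
usage = summarize_usage(records)
print(usage["openai/gpt-4"]["total_tokens"])
```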
New Alert Categories:
Multi-level Severity:
API Usage Analytics:
Rate Limiter Analytics:
Flexible Configuration Options:
{
  "monitoring": {
    "interval_seconds": 30.0,
    "thresholds": {
      "swap_critical": 80.0,
      "swap_warning": 60.0
    },
    "api": {
      "daily_cost_threshold": 100.0,
      "rate_limit_threshold": 10
    }
  }
}
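A warning/critical threshold pair like the swap settings above could be applied roughly as follows. This is a sketch under assumptions: the `classify` helper and the choice of inclusive (`>=`) comparisons are illustrative, not the monitoring system's actual logic.

```python
import json

# A trimmed copy of the configuration, embedded so the sketch is self-contained.
CONFIG_TEXT = """
{
  "monitoring": {
    "thresholds": {"swap_critical": 80.0, "swap_warning": 60.0}
  }
}
"""


def classify(value_percent: float, warning: float, critical: float) -> str:
    """Map a usage percentage to a level; inclusive bounds are an assumption."""
    if value_percent >= critical:
        return "critical"
    if value_percent >= warning:
        return "warning"
    return "ok"


config = json.loads(CONFIG_TEXT)
thresholds = config["monitoring"]["thresholds"]
swap_level = classify(99.9, thresholds["swap_warning"], thresholds["swap_critical"])
print(swap_level)
```

Under these assumptions, a swap usage of 99.9% is classified as `critical`, and a reading of 65% would be a `warning`.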
Verbosity Levels:
CPU: 0.0% (4 logical cores, 2 physical cores)
Memory: 5.5GB used / 7.6GB total (72.1%)
Swap: 2.0GB used / 2.0GB total (99.9%)
Per-core CPU: [20.0%, 30.0%, 20.0%, 12.5%]
CPU Frequency: 2400MHz (current), 800MHz (min), 3600MHz (max)
Total API Cost: $0.086
Total Tokens: 3,935 (2,045 prompt + 1,890 completion)
Providers: OpenAI, Anthropic, Gemini
Efficiency Ratios: 0.924 (gpt-4), 0.067 (claude-3)
Rate Limit Hits: 3 (gemini-pro)
OpenAI/GPT-4: healthy (100.0% success, 60/min rate)
Anthropic/Claude-3: healthy (100.0% success, 10/min rate)
Gemini/Gemini-Pro: critical (37.5% success, 2/min rate)
Overall Health: degraded (due to gemini rate limiting)
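The sample output reports an overall status of `degraded` when one of three providers is `critical`. One rule that would produce this result is sketched below; the rule itself is an assumption, and the real aggregation logic may differ.

```python
from typing import Mapping


def overall_health(statuses: Mapping[str, str]) -> str:
    """Assumed rule: healthy if all are healthy, critical if all are critical,
    otherwise degraded."""
    values = list(statuses.values())
    if not values or all(s == "healthy" for s in values):
        return "healthy"
    if all(s == "critical" for s in values):
        return "critical"
    return "degraded"


providers = {
    "OpenAI/GPT-4": "healthy",
    "Anthropic/Claude-3": "healthy",
    "Gemini/Gemini-Pro": "critical",
}
result = overall_health(providers)
print(result)
```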
# Enhanced Gemini handler with comprehensive monitoring
# (monitoring_system and RateLimitError are assumed to be defined elsewhere in the project)
import time

async def generate_text(self, prompt, model, **kwargs):
    start_time = time.time()

    # Rate limiter with monitoring
    if not await self.rate_limiter.wait():
        await monitoring_system.log_api_token_usage(
            model_name=model, provider="gemini",
            prompt_tokens=0, completion_tokens=0,
            rate_limited=True, success=False
        )
        raise RateLimitError("Rate limit exceeded")

    try:
        response = await self._api_call(prompt, model)

        # Log successful usage with full metrics
        await monitoring_system.log_api_token_usage(
            model_name=model,
            provider="gemini",
            prompt_tokens=response.usage.prompt_tokens,
            completion_tokens=response.usage.completion_tokens,
            cost_usd=self._calculate_cost(response.usage),
            success=True,
            rate_limited=False
        )
        return response.text
    except Exception as e:
        # Log failed usage, then re-raise so callers still see the error
        await monitoring_system.log_api_token_usage(
            model_name=model, provider="gemini",
            prompt_tokens=0, completion_tokens=0,
            success=False, error_type=type(e).__name__
        )
        raise
# BDI Agent with performance tracking
import time

async def execute_action(self, action):
    start_time = time.time()
    try:
        result = await self._perform_action(action)
        # Convert elapsed seconds to milliseconds
        execution_time = (time.time() - start_time) * 1000
        await monitoring_system.log_agent_performance(
            agent_id=self.agent_id,
            action_type=action.type,
            execution_time_ms=execution_time,
            success=True,
            metadata={"complexity": action.complexity}
        )
        return result
    except Exception as e:
        execution_time = (time.time() - start_time) * 1000
        await monitoring_system.log_agent_performance(
            agent_id=self.agent_id,
            action_type=action.type,
            execution_time_ms=execution_time,
            success=False,
            metadata={"error": type(e).__name__}
        )
        raise
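The success and failure branches above repeat the same timing and logging code. One possible way to factor that out is an async context manager, sketched here. `track_performance` and the `fake_log` stand-in are hypothetical; `fake_log` only imitates the keyword arguments of `log_agent_performance` shown above.

```python
import asyncio
import time
from contextlib import asynccontextmanager
from typing import Any, Awaitable, Callable, Dict, List


@asynccontextmanager
async def track_performance(
    log: Callable[..., Awaitable[None]], agent_id: str, action_type: str
):
    """Time the wrapped block and log exactly once, on success or failure."""
    start = time.perf_counter()
    success = True
    metadata: Dict[str, Any] = {}
    try:
        yield metadata  # the caller may add fields such as "complexity"
    except Exception as exc:
        success = False
        metadata["error"] = type(exc).__name__
        raise  # the original exception still reaches the caller
    finally:
        await log(
            agent_id=agent_id,
            action_type=action_type,
            execution_time_ms=(time.perf_counter() - start) * 1000,
            success=success,
            metadata=metadata,
        )


events: List[Dict[str, Any]] = []


async def fake_log(**fields: Any) -> None:
    # Stand-in for monitoring_system.log_agent_performance (assumption).
    events.append(fields)


async def demo() -> None:
    async with track_performance(fake_log, "agent-1", "plan"):
        await asyncio.sleep(0)
    try:
        async with track_performance(fake_log, "agent-1", "act"):
            raise ValueError("boom")
    except ValueError:
        pass


asyncio.run(demo())
print([e["success"] for e in events])
```

The sketch uses `time.perf_counter()` rather than `time.time()` because it is monotonic, so elapsed times cannot go negative if the system clock is adjusted.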
{
  "monitoring": {
    "interval_seconds": 60,
    "memory_logging_enabled": true,
    "thresholds": {
      "cpu_critical": 90,
      "memory_critical": 85,
      "swap_critical": 80,
      "disk_critical": 90
    },
    "api": {
      "daily_cost_threshold": 100.0,
      "rate_limit_threshold": 10
    }
  }
}
{
  "monitoring": {
    "interval_seconds": 15,
    "min_alert_severity": "INFO",
    "log_performance_details": true
  }
}
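A setting such as `min_alert_severity` suggests that alerts below a configured level are suppressed. A minimal sketch of such a filter follows. The level names other than `INFO`, their ordering, and the `should_emit` helper are assumptions, not the system's actual severity model.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Assumed severity ladder; only INFO appears in the configuration above."""
    INFO = 10
    WARNING = 20
    CRITICAL = 30


def should_emit(alert_severity: str, min_alert_severity: str) -> bool:
    """Return True when the alert is at or above the configured minimum."""
    return Severity[alert_severity.upper()] >= Severity[min_alert_severity.upper()]


print(should_emit("WARNING", "INFO"))
print(should_emit("INFO", "WARNING"))
```

With `min_alert_severity` set to `INFO`, every level in this assumed ladder passes the filter, which matches the verbose profile shown in the configuration.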
The Enhanced Monitoring System now provides enterprise-grade monitoring with:
The system captures live resource and performance data with configurable verbosity, which covers the monitoring capabilities that were requested.