optimized_audit_gen_agent.md · 5.2 KB
Optimized Audit Gen Agent Documentation
Overview
The OptimizedAuditGenAgent is a specialized version of BaseGenAgent optimized for code auditing. It addresses the "giant file problem" through smart filtering, chunking, and audit-focused analysis, generating manageable documentation chunks with focus on code quality, maintainability, and audit insights.
File: tools/optimized_audit_gen_agent.py
Class: OptimizedAuditGenAgent
Version: 1.0.0
Status: ✅ Active
Architecture
Design Principles
Audit-Focused: Optimized specifically for code auditing
Chunking Strategy: Splits output into manageable chunks
Smart Filtering: Enhanced file filtering for audit focus
Code Quality Analysis: Detects code smells and issues
Security Analysis: Identifies security vulnerabilities
Core Components
class OptimizedAuditGenAgent:
- memory_agent: MemoryAgent - For workspace management
- max_file_size_kb: int - Maximum file size (default: 500KB)
- max_files_per_chunk: int - Files per chunk (default: 50)
- output_dir: Path - Audit reports directory
Features
1. Smart File Filtering
Enhanced exclude patterns for audit focus:
Memory and log files (major bloat sources)
Binary and media files
Documentation files (reduce noise)
Build artifacts
Test artifacts
2. Code Quality Detection
Detects code smells:
Long functions (>50 lines)
Deep nesting (>4 levels)
Magic numbers
Hardcoded strings
TODO/FIXME comments
Long lines (>120 chars)
Missing docstrings
3. Security Pattern Detection
Identifies security issues:
SQL injection risks
Command injection risks
Hardcoded secrets
eval() usage
pickle usage
4. Chunking Strategy
Prevents giant output files:
Splits files into chunks
Configurable chunk size
Maintains file relationships
Generates index
Usage
Generate Audit Documentation
from tools.optimized_audit_gen_agent import OptimizedAuditGenAgent
from agents.memory_agent import MemoryAgent
memory_agent = MemoryAgent()
audit_agent = OptimizedAuditGenAgent(
memory_agent=memory_agent,
max_file_size_kb=500,
max_files_per_chunk=50
)
Generate audit documentation
success, result = audit_agent.generate_audit_documentation("/path/to/codebase")
if success:
print(f"Main report: {result['main_report']}")
print(f"Files analyzed: {result['files_analyzed']}")
print(f"Chunks created: {result['chunks_created']}")
CLI Usage
python tools/optimized_audit_gen_agent.py /path/to/codebase \
--max-file-size 500 \
--max-files-per-chunk 50
Output Structure
Main Report
audit_report_{directory}_{timestamp}.md
Contains:
Summary statistics
File index
Links to chunk files
File list
Chunk Files
audit_chunk_{number:03d}_{directory}_{timestamp}.md
Each chunk contains:
File contents
Language-specific formatting
Code blocks
File paths
Configuration
Parameters
max_file_size_kb (int, default: 500): Maximum file size in KB
max_files_per_chunk (int, default: 50): Maximum files per chunk
agent_id (str, default: "optimized_audit_gen_agent"): Agent identifier
Exclude Patterns
Optimized excludes for audit:
Memory/log files
Binary files
Documentation files
Build artifacts
Test artifacts
Limitations
Current Limitations
No LLM Analysis: Doesn't use LLM for analysis
Basic Detection: Simple pattern matching
No Metrics: Doesn't calculate metrics
No Recommendations: No improvement suggestions
Static Patterns: Fixed pattern set
Recommended Improvements
LLM Integration: Use LLM for deeper analysis
Metrics Calculation: Calculate complexity metrics
Recommendations: Generate improvement suggestions
Custom Patterns: Support custom patterns
Incremental Analysis: Only analyze changes
Visualization: Charts and graphs
Integration: Integrate with other tools
Integration
With Memory Agent
Uses memory agent for:
Workspace management
Output directory
Data storage
Examples
Custom Configuration
audit_agent = OptimizedAuditGenAgent(
memory_agent=memory_agent,
max_file_size_kb=1000, # Larger files
max_files_per_chunk=100 # More files per chunk
)
Technical Details
Dependencies
agents.memory_agent.MemoryAgent: Workspace management
pathspec: File pattern matching (optional)
core.belief_system.BeliefSystem: Belief system (optional)
Code Smell Patterns
Uses regex patterns to detect:
Long functions
Deep nesting
Magic numbers
Hardcoded strings
TODO/FIXME comments
Long lines
Missing docstrings
Security Patterns
Detects security issues:
SQL injection
Command injection
Hardcoded secrets
eval() usage
pickle usage
Future Enhancements
LLM Analysis: Deep LLM-powered analysis
Metrics Framework: Comprehensive metrics
Recommendations Engine: Auto-generate suggestions
Custom Patterns: User-defined patterns
Incremental Mode: Only analyze changes
Visualization: Charts and graphs
Integration: Better tool integration