Optimized Audit Gen Agent Documentation
Overview
OptimizedAuditGenAgent is a variant of BaseGenAgent specialized for code auditing. It addresses the "giant file problem" (a single generated output file too large to review) by filtering out low-value files, splitting the output into manageable chunks, and focusing the analysis on code quality, maintainability, and security.
File: tools/optimized_audit_gen_agent.py
Class: OptimizedAuditGenAgent
Version: 1.0.0
Status: ✅ Active
Architecture
Design Principles
- Audit-Focused: Optimized specifically for code auditing
- Chunking Strategy: Splits output into manageable chunks
- Smart Filtering: Enhanced file filtering for audit focus
- Code Quality Analysis: Detects code smells and issues
- Security Analysis: Identifies security vulnerabilities
Core Components
class OptimizedAuditGenAgent:
- memory_agent: MemoryAgent - For workspace management
- max_file_size_kb: int - Maximum file size (default: 500KB)
- max_files_per_chunk: int - Files per chunk (default: 50)
- output_dir: Path - Audit reports directory
Features
1. Smart File Filtering
Enhanced exclude patterns for audit focus:
- Memory and log files (major bloat sources)
- Binary and media files
- Documentation files (reduce noise)
- Build artifacts
- Test artifacts
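The filtering step above can be sketched as a size check followed by a pattern check. This is a minimal illustration, not the agent's actual code: the pattern list, the `should_include` helper, and the use of `fnmatchcase` are all assumptions, and only the 500 KB default comes from this document.

```python
from fnmatch import fnmatchcase

# Hypothetical exclude patterns, one group per category listed above.
# The agent's real pattern list is not shown in this document.
EXCLUDE_PATTERNS = [
    "*.log", "memory/*",                # memory and log files
    "*.png", "*.jpg", "*.pyc",          # binary and media files
    "*.md", "docs/*",                   # documentation files
    "build/*", "dist/*",                # build artifacts
    ".pytest_cache/*", "htmlcov/*",     # test artifacts
]


def should_include(rel_path: str, size_bytes: int, max_file_size_kb: int = 500) -> bool:
    """Return True if a file passes the size limit and matches no exclude pattern."""
    if size_bytes > max_file_size_kb * 1024:
        return False
    posix_path = rel_path.replace("\\", "/")  # normalize Windows separators
    # fnmatchcase lets "*" span "/" too, so "*.log" also matches "logs/app.log".
    return not any(fnmatchcase(posix_path, pattern) for pattern in EXCLUDE_PATTERNS)
```

Checking the size first avoids pattern matching on files that would be rejected anyway.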
2. Code Quality Detection
Detects code smells:
- Long functions (>50 lines)
- Deep nesting (>4 levels)
- Magic numbers
- Hardcoded strings
- TODO/FIXME comments
- Long lines (>120 chars)
- Missing docstrings
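Three of the checks above (long lines, deep nesting, TODO/FIXME comments) can be done line by line. The sketch below is one possible approach, not the agent's own implementation: `find_line_smells` is a hypothetical helper, and measuring nesting by four-space indentation is an assumption. The thresholds are the ones listed above.

```python
import re

MAX_LINE_LENGTH = 120  # threshold from the list above
MAX_NESTING = 4        # threshold from the list above
INDENT_WIDTH = 4       # assumption: code is indented with four spaces

TODO_RE = re.compile(r"#.*\b(?:TODO|FIXME)\b")


def find_line_smells(source: str) -> list:
    """Return (line_number, smell_name) pairs for simple per-line smells."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        if len(line) > MAX_LINE_LENGTH:
            findings.append((lineno, "long_line"))
        if TODO_RE.search(line):
            findings.append((lineno, "todo_comment"))
        # Approximate nesting depth by counting leading spaces.
        depth = (len(line) - len(line.lstrip(" "))) // INDENT_WIDTH
        if depth > MAX_NESTING:
            findings.append((lineno, "deep_nesting"))
    return findings
```

Checks such as long functions or missing docstrings need multi-line context and are not covered by this per-line sketch.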
3. Security Pattern Detection
Identifies security issues:
- SQL injection risks
- Command injection risks
- Hardcoded secrets
- eval() usage
- pickle usage
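Pattern-based detection of the issues above might look like the following sketch. The regular expressions are illustrative assumptions rather than the agent's actual patterns, and `scan_security` is a hypothetical helper. Simple patterns like these produce both false positives and false negatives.

```python
import re

# Hypothetical patterns, one per category listed above.
SECURITY_PATTERNS = {
    # execute() called with an f-string, or a string literal followed by % or +
    "sql_injection": re.compile(r"""\bexecute\s*\(\s*(?:f["']|["'].*["']\s*[%+])"""),
    # os.system(...) or any subprocess call with shell=True
    "command_injection": re.compile(r"\bos\.system\s*\(|\bsubprocess\.\w+\s*\(.*shell\s*=\s*True"),
    # a secret-like name assigned a non-empty string literal
    "hardcoded_secret": re.compile(
        r"""\b(?:password|secret|api_key|token)\s*=\s*["'][^"']+["']""", re.IGNORECASE
    ),
    "eval_usage": re.compile(r"\beval\s*\("),
    "pickle_usage": re.compile(r"\bpickle\.loads?\s*\("),
}


def scan_security(source: str) -> list:
    """Return (line_number, issue_name) pairs for lines matching a pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECURITY_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

The word boundary in `\beval` keeps `ast.literal_eval(...)` from being flagged, since the underscore counts as a word character.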
4. Chunking Strategy
Prevents giant output files:
- Splits files into chunks
- Configurable chunk size
- Maintains file relationships
- Generates index
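The core of the chunking step is splitting the filtered file list into groups of at most `max_files_per_chunk`. A minimal sketch follows, with `chunk_files` as a hypothetical helper. Sorting the paths first is an assumption about how file relationships might be kept: it places files from the same directory next to each other.

```python
def chunk_files(files: list, max_files_per_chunk: int = 50) -> list:
    """Split a list of file paths into consecutive chunks of bounded size."""
    if max_files_per_chunk < 1:
        raise ValueError("max_files_per_chunk must be at least 1")
    ordered = sorted(files)  # assumption: keep files from one directory adjacent
    return [
        ordered[start:start + max_files_per_chunk]
        for start in range(0, len(ordered), max_files_per_chunk)
    ]
```

With the default of 50, a codebase of 120 files yields three chunks of 50, 50, and 20 files.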
Usage
Generate Audit Documentation
from tools.optimized_audit_gen_agent import OptimizedAuditGenAgent
from agents.memory_agent import MemoryAgent

memory_agent = MemoryAgent()
audit_agent = OptimizedAuditGenAgent(
    memory_agent=memory_agent,
    max_file_size_kb=500,
    max_files_per_chunk=50,
)

# Generate audit documentation
success, result = audit_agent.generate_audit_documentation("/path/to/codebase")
if success:
    print(f"Main report: {result['main_report']}")
    print(f"Files analyzed: {result['files_analyzed']}")
    print(f"Chunks created: {result['chunks_created']}")
CLI Usage
python tools/optimized_audit_gen_agent.py /path/to/codebase \
--max-file-size 500 \
--max-files-per-chunk 50
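The command above implies a positional path and two integer options. The sketch below shows how such an interface could be declared, assuming the script uses `argparse`, which this document does not confirm. `build_parser` and the positional name `codebase` are hypothetical. The flag names and defaults are taken from this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the CLI flags shown above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Generate chunked audit documentation for a codebase."
    )
    parser.add_argument("codebase", help="Path to the codebase to audit")
    parser.add_argument("--max-file-size", type=int, default=500,
                        help="Maximum file size in KB")
    parser.add_argument("--max-files-per-chunk", type=int, default=50,
                        help="Maximum number of files per chunk")
    return parser
```

`argparse` converts dashes in option names to underscores, so the values are read as `args.max_file_size` and `args.max_files_per_chunk`.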
Output Structure
Main Report
audit_report_{directory}_{timestamp}.md
Contains:
- Summary statistics
- File index
- Links to chunk files
- File list
Chunk Files
audit_chunk_{number:03d}_{directory}_{timestamp}.md
Each chunk contains:
- File contents
- Language-specific formatting
- Code blocks
- File paths
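The two file name templates above can be filled in as sketched below. Several details are assumptions, since this document does not specify them: the timestamp format, chunk numbering starting at 1, and using the last path component as `{directory}`. `report_names` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def report_names(codebase: str, chunk_count: int, now: Optional[datetime] = None):
    """Return the main report name and the list of chunk file names."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y%m%d_%H%M%S")  # assumed timestamp format
    directory = Path(codebase).name            # assumed: last path component
    main_report = f"audit_report_{directory}_{timestamp}.md"
    chunk_names = [
        f"audit_chunk_{number:03d}_{directory}_{timestamp}.md"
        for number in range(1, chunk_count + 1)  # assumed: numbering starts at 1
    ]
    return main_report, chunk_names
```

The `:03d` format zero-pads the chunk number to three digits, so the files sort correctly by name up to chunk 999.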
Configuration
Parameters
- max_file_size_kb (int, default: 500): Maximum file size in KB
- max_files_per_chunk (int, default: 50): Maximum files per chunk
- agent_id (str, default: "optimized_audit_gen_agent"): Agent identifier
Exclude Patterns
Optimized excludes for audit:
- Memory/log files
- Binary files
- Documentation files
- Build artifacts
- Test artifacts
Limitations
Current Limitations
- No LLM Analysis: The analysis does not use an LLM
- Basic Detection: Detection relies on simple pattern matching
- No Metrics: No complexity or quality metrics are calculated
- No Recommendations: Findings come without improvement suggestions
- Static Patterns: The pattern set is fixed; custom patterns are not supported
Recommended Improvements
- LLM Integration: Use LLM for deeper analysis
- Metrics Calculation: Calculate complexity metrics
- Recommendations: Generate improvement suggestions
- Custom Patterns: Support custom patterns
- Incremental Analysis: Only analyze changes
- Visualization: Charts and graphs
- Integration: Integrate with other tools
Integration
With Memory Agent
Uses memory agent for:
- Workspace management
- Output directory
- Data storage
Examples
Custom Configuration
audit_agent = OptimizedAuditGenAgent(
    memory_agent=memory_agent,
    max_file_size_kb=1000,    # Larger files
    max_files_per_chunk=100,  # More files per chunk
)
Technical Details
Dependencies
- agents.memory_agent.MemoryAgent: Workspace management
- pathspec: File pattern matching (optional)
- core.belief_system.BeliefSystem: Belief system (optional)
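Because pathspec is listed as optional, the matching code needs a fallback for when it is not installed. One common way to handle an optional dependency is sketched below. `make_matcher` is a hypothetical helper, and the `fnmatch` fallback is an assumption, not necessarily what the agent does.

```python
from fnmatch import fnmatchcase

try:
    import pathspec  # optional third-party dependency
except ImportError:
    pathspec = None


def make_matcher(patterns: list):
    """Return a function that reports whether a path matches any pattern."""
    if pathspec is not None:
        # gitignore-style matching when pathspec is available
        spec = pathspec.PathSpec.from_lines("gitwildmatch", patterns)
        return spec.match_file
    # Fallback (assumption): plain shell-style globbing from the standard library
    return lambda path: any(fnmatchcase(path, pattern) for pattern in patterns)
```

The two branches agree on simple patterns such as `*.log`, but gitignore semantics and `fnmatch` semantics differ for patterns containing directory separators.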
Code Smell Patterns
Uses regex patterns to detect:
- Long functions
- Deep nesting
- Magic numbers
- Hardcoded strings
- TODO/FIXME comments
- Long lines
- Missing docstrings
Security Patterns
Detects security issues:
- SQL injection
- Command injection
- Hardcoded secrets
- eval() usage
- pickle usage
Future Enhancements
- LLM Analysis: Deep LLM-powered analysis
- Metrics Framework: Comprehensive metrics
- Recommendations Engine: Auto-generate suggestions
- Custom Patterns: User-defined patterns
- Incremental Mode: Only analyze changes
- Visualization: Charts and graphs
- Integration: Better tool integration