Guard Rails
Overview
RoboDev provides layered safety boundaries — guard rails — to ensure autonomous AI agents operate within enterprise-approved limits. Guard rails are applied at multiple levels for defence in depth.
New to guard rails?
For a plain-language introduction, see Guard Rails Overview. This page covers the detailed configuration reference.
sequenceDiagram
participant Ticket as Incoming Ticket
participant L1 as 1. Controller Validation
participant L2 as 2. Engine Hooks
participant L3 as 3. Quality Gate
participant L4 as 4. Watchdog
Ticket->>L1: Check allowed repos, task types, limits
L1->>L2: Pass — launch agent
Note over L2: Intercept tool calls<br/>in real time
L2->>L3: Agent finishes
Note over L3: Scan for secrets,<br/>OWASP patterns
L3->>L4: Continuous monitoring
Note over L4: Detect loops, stalls,<br/>cost overruns
1. Controller-Level Guards
Applied before a job is created. Configured in robodev-config.yaml:
guardrails:
  max_cost_per_job: 50.00
  max_concurrent_jobs: 5
  max_job_duration_minutes: 120
  allowed_repos:
    - "org/frontend-*"
    - "org/backend-*"
  blocked_file_patterns:
    - "*.env"
    - "**/secrets/**"
    - "**/credentials/**"
  require_human_approval_before_mr: false
  allowed_task_types:
    - "dependency_upgrade"
    - "test_fix"
    - "bug_fix"
    - "documentation"
What Each Guard Does
| Guard | Effect |
|---|---|
| max_cost_per_job | Terminates jobs exceeding the USD budget |
| max_concurrent_jobs | Queues new tickets when limit is reached |
| max_job_duration_minutes | Sets activeDeadlineSeconds on K8s Jobs |
| allowed_repos | Rejects tickets for repositories not matching glob patterns |
| blocked_file_patterns | Injected into engine hooks to prevent modification |
| require_human_approval_before_mr | Pauses before merge request (MR) creation for human sign-off |
| allowed_task_types | Rejects tickets with disallowed task types |
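As a rough illustration of the pre-launch checks described above, the sketch below validates a ticket against allowed_repos and allowed_task_types and derives the activeDeadlineSeconds value from max_job_duration_minutes. The function names (validate_ticket, active_deadline_seconds) are hypothetical, and fnmatch-style glob matching is an assumption; the controller's actual matching rules may differ.

```python
from fnmatch import fnmatchcase

# Values copied from the example guardrails config on this page.
GUARDRAILS = {
    "max_job_duration_minutes": 120,
    "allowed_repos": ["org/frontend-*", "org/backend-*"],
    "allowed_task_types": ["dependency_upgrade", "test_fix", "bug_fix", "documentation"],
}


def validate_ticket(repo: str, task_type: str, cfg: dict = GUARDRAILS) -> list:
    """Return a list of rejection reasons; an empty list means the ticket passes."""
    reasons = []
    # ASSUMPTION: repositories are matched with shell-style globs (fnmatch semantics).
    if not any(fnmatchcase(repo, pattern) for pattern in cfg["allowed_repos"]):
        reasons.append(f"repository {repo!r} matches no allowed_repos pattern")
    if task_type not in cfg["allowed_task_types"]:
        reasons.append(f"task type {task_type!r} is not in allowed_task_types")
    return reasons


def active_deadline_seconds(cfg: dict = GUARDRAILS) -> int:
    """Convert max_job_duration_minutes into the seconds value used on a K8s Job."""
    return cfg["max_job_duration_minutes"] * 60
```

With the example config, a bug_fix ticket for org/frontend-web passes, a ticket for org/infra is rejected, and the 120-minute limit becomes a 7200-second deadline.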
2. Engine-Level Guards (Claude Code Hooks)
Only applies to Claude Code
Engine hooks are only available for the Claude Code engine. Other engines (Codex, Aider, OpenCode, Cline) rely on prompt-based rules, which are advisory, not enforced.
Engine-level guards are applied inside the execution container via Claude Code's hooks system. RoboDev generates a settings.json file and mounts it into the container:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "/opt/robodev/hooks/block-dangerous-commands.sh"
          }
        ]
      },
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "/opt/robodev/hooks/block-sensitive-files.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "/opt/robodev/hooks/heartbeat.sh"
          }
        ]
      }
    ]
  }
}
Blocked Commands
The block-dangerous-commands.sh hook blocks:
- rm -rf / and similar destructive commands
- curl | bash, wget | bash (remote code execution)
- eval with untrusted input
- sudo (privilege escalation)
- chmod 777 (insecure permissions)
- git push --force to main/master
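The shipped hook is a shell script whose contents are not shown here. Purely to illustrate the kind of check a PreToolUse hook can perform, here is a minimal Python sketch. It assumes Claude Code's documented hook contract (a JSON payload with tool_name and tool_input, and exit code 2 to block the call with stderr returned to the agent). The names RULES, find_violation and run_hook are hypothetical, the regular expressions are deliberately simple, and the eval rule is omitted because judging whether input is untrusted needs more context than a pattern match.

```python
import json
import re
import sys

# Illustrative patterns only; a production hook would need broader coverage.
RULES = [
    (re.compile(r"\brm\s+-(?=[a-zA-Z]*r)(?=[a-zA-Z]*f)[a-zA-Z]+\s+/(\s|$)"),
     "recursive force-delete of /"),
    (re.compile(r"\b(curl|wget)\b[^|;&]*\|\s*(ba)?sh\b"),
     "piping a download into a shell"),
    (re.compile(r"\bsudo\b"), "privilege escalation via sudo"),
    (re.compile(r"\bchmod\s+(-[a-zA-Z]+\s+)*0?777\b"), "world-writable permissions"),
    (re.compile(r"\bgit\s+push\b(?=.*(--force\b|\s-f\b))(?=.*\b(main|master)\b)"),
     "force-push to main/master"),
]


def find_violation(command: str):
    """Return a human-readable reason if the command matches a rule, else None."""
    for pattern, reason in RULES:
        if pattern.search(command):
            return reason
    return None


def run_hook(payload_text: str) -> int:
    """Process one hook payload and return the exit code the hook should use."""
    payload = json.loads(payload_text)
    if payload.get("tool_name") != "Bash":
        return 0  # not a shell command, nothing to check
    command = payload.get("tool_input", {}).get("command", "")
    reason = find_violation(command)
    if reason is None:
        return 0
    # ASSUMPTION: exit code 2 blocks the tool call and stderr is shown to the agent.
    print(f"Blocked by guard rail: {reason}", file=sys.stderr)
    return 2
```

A hook script built on this sketch would end with `sys.exit(run_hook(sys.stdin.read()))`.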
Blocked Files
The block-sensitive-files.sh hook blocks writes to:
- .env* files
- **/credentials/**
- **/secrets/**
- *.pem, *.key (private keys)
Custom patterns can be added via the BLOCKED_FILE_PATTERNS environment variable.
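The following sketch shows one way the file check could combine the built-in patterns with custom ones. It is not the shipped script. The comma-separated format for BLOCKED_FILE_PATTERNS is an assumption (this page does not specify it), as are the names load_patterns and is_blocked and the choice to match each pattern against both the file name and the full path.

```python
import os
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Built-in patterns listed on this page.
DEFAULT_PATTERNS = [".env*", "**/credentials/**", "**/secrets/**", "*.pem", "*.key"]


def load_patterns(environ=os.environ) -> list:
    """Combine the defaults with custom patterns from the environment."""
    # ASSUMPTION: BLOCKED_FILE_PATTERNS holds a comma-separated list of globs.
    extra = environ.get("BLOCKED_FILE_PATTERNS", "")
    return DEFAULT_PATTERNS + [p.strip() for p in extra.split(",") if p.strip()]


def is_blocked(file_path: str, patterns: list) -> bool:
    """Return True if the path matches any blocked pattern."""
    path = PurePosixPath(file_path)
    # Prefix a slash so that "**/secrets/**" also matches a top-level secrets/ directory.
    rooted = "/" + str(path).lstrip("/")
    return any(
        fnmatchcase(path.name, pattern) or fnmatchcase(rooted, pattern)
        for pattern in patterns
    )
```

Under these assumptions, app/.env.local and secrets/token.txt are blocked, while src/environment.py is not.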
3. Custom Guard Rails via Markdown (Planned)
Not yet wired
The guardrails.md injection path is not currently wired in the controller. The TaskProfileConfig struct has a Workflow field and the promptbuilder package exists, but the controller builds execution specs directly from ticket fields rather than routing through the promptbuilder. This feature is on the roadmap.
The intention is that users will provide a guardrails.md file (mounted from a ConfigMap) that the prompt builder appends to every agent prompt, giving the agent advisory rules such as:
# Guard Rails
## Never Do
- Never modify CI/CD pipeline configuration files
- Never change database migration files
## Always Do
- Always run the full test suite before creating an MR
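Because this path is not yet wired, the following is only a sketch of the intended behaviour: the rules text is appended to the task prompt, and a missing or empty file leaves the prompt unchanged. The function name append_guardrails is hypothetical and does not reflect the promptbuilder package's actual API.

```python
from typing import Optional


def append_guardrails(task_prompt: str, guardrails_md: Optional[str]) -> str:
    """Append the contents of guardrails.md to a prompt, if any rules were provided."""
    rules = (guardrails_md or "").strip()
    if not rules:
        # No ConfigMap mounted or the file is empty: leave the prompt as-is.
        return task_prompt
    return f"{task_prompt.rstrip()}\n\n{rules}\n"
```

Rules delivered this way remain advisory: they shape the agent's behaviour through the prompt but are not enforced the way engine hooks are.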
4. Per-Task-Type Permission Profiles (Partially Implemented)
Config schema only
task_profiles is present in the config schema and values are stored, but per-task-type file pattern restrictions (allowed_file_patterns, blocked_file_patterns) are not enforced at runtime. The controller reads AllowedTaskTypes for validation but does not yet apply profile-level constraints to agent pods.
The task_profiles config structure is defined for future enforcement:
guardrails:
  task_profiles:
    dependency_upgrade:
      allowed_file_patterns:
        - "pyproject.toml"
        - "requirements*.txt"
      max_cost_per_job: 30.00
      max_job_duration_minutes: 60
    bug_fix:
      blocked_file_patterns:
        - "**/migrations/**"
        - "**/auth/**"
      max_cost_per_job: 50.00
    documentation:
      allowed_file_patterns:
        - "*.md"
        - "docs/**"
      blocked_commands:
        - "git push"
      max_cost_per_job: 10.00
The controller selects the profile based on ticket labels or the ticket_type field from the ticketing backend.
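Since profile-level constraints are not enforced yet, the sketch below only illustrates how enforcement might look once it exists. It makes several assumptions that this page does not confirm: profile values override the global limits, blocked patterns take precedence over allowed ones, a profile without allowed_file_patterns permits any file that is not blocked, and globs use fnmatch semantics. The names effective_limits and may_modify are hypothetical.

```python
from fnmatch import fnmatchcase

# Global limits and profiles copied from the examples on this page.
GLOBAL_LIMITS = {"max_cost_per_job": 50.00, "max_job_duration_minutes": 120}
TASK_PROFILES = {
    "dependency_upgrade": {
        "allowed_file_patterns": ["pyproject.toml", "requirements*.txt"],
        "max_cost_per_job": 30.00,
        "max_job_duration_minutes": 60,
    },
    "bug_fix": {
        "blocked_file_patterns": ["**/migrations/**", "**/auth/**"],
        "max_cost_per_job": 50.00,
    },
    "documentation": {
        "allowed_file_patterns": ["*.md", "docs/**"],
        "max_cost_per_job": 10.00,
    },
}


def effective_limits(ticket_type: str) -> dict:
    """ASSUMPTION: a profile value overrides the global limit of the same name."""
    profile = TASK_PROFILES.get(ticket_type, {})
    return {key: profile.get(key, default) for key, default in GLOBAL_LIMITS.items()}


def _matches(path: str, patterns: list) -> bool:
    rooted = "/" + path.lstrip("/")   # lets "**/auth/**" match a top-level auth/
    name = path.rsplit("/", 1)[-1]    # lets "*.md" match files in any directory
    return any(
        fnmatchcase(path, p) or fnmatchcase(rooted, p) or fnmatchcase(name, p)
        for p in patterns
    )


def may_modify(ticket_type: str, path: str) -> bool:
    """ASSUMPTION: blocked patterns win; no allow-list means everything else is allowed."""
    profile = TASK_PROFILES.get(ticket_type, {})
    if _matches(path, profile.get("blocked_file_patterns", [])):
        return False
    allowed = profile.get("allowed_file_patterns")
    return allowed is None or _matches(path, allowed)
```

Under these assumptions a dependency_upgrade job would run with a 30.00 USD budget and a 60-minute limit, and a bug_fix job could edit src/api/handler.py but not src/auth/login.py.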
5. Quality Gate
An optional post-completion review that runs as a separate K8s Job:
quality_gate:
  enabled: true
  mode: "post-completion"
  engine: claude-code
  max_cost_per_review: 5.00
  security_checks:
    scan_for_secrets: true
    check_owasp_patterns: true
    verify_guardrail_compliance: true
    check_dependency_cves: true
  on_failure: "retry_with_feedback"
The quality gate is read-only — it cannot modify the repository.
6. Progress Watchdog
Detects agents that are stalled, looping, or unproductive during execution:
progress_watchdog:
  enabled: true
  check_interval_seconds: 60
  min_consecutive_ticks: 2
  research_grace_period_minutes: 5
  loop_detection_threshold: 10
  thrashing_token_threshold: 80000
  stall_idle_seconds: 300
  cost_velocity_max_per_10_min: 15.00
  unanswered_human_timeout_minutes: 30
Detection Rules
| Rule | Detects | Action |
|---|---|---|
| Loop detection | Same tool call repeated N times | Terminate with feedback |
| Thrashing | High token use, no file changes | Warn, then terminate |
| Stall | No tool calls for N seconds | Terminate |
| Cost velocity | Spending > $X per 10 minutes | Warn |
| Telemetry failure | Heartbeat stopped advancing | Warn |
| Unanswered human | NeedsHuman with no response | Terminate and notify |
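All terminate actions require the anomaly to persist for at least min_consecutive_ticks consecutive checks to prevent false positives. With the example values (check_interval_seconds: 60, min_consecutive_ticks: 2), an anomaly must be observed on two consecutive checks, one minute apart, before the job is terminated.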
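To make the persistence rule concrete, here is a minimal sketch of a watchdog that evaluates three of the rules (loop, stall, cost velocity) and only escalates a terminating anomaly once it has been seen on min_consecutive_ticks consecutive checks. The class and field names (WatchdogConfig, Snapshot, Watchdog, detect) are hypothetical, and the "pending" state for an anomaly that has not yet persisted long enough is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class WatchdogConfig:
    # Defaults mirror the example progress_watchdog config on this page.
    min_consecutive_ticks: int = 2
    loop_detection_threshold: int = 10
    stall_idle_seconds: int = 300
    cost_velocity_max_per_10_min: float = 15.00


@dataclass
class Snapshot:
    """What one check observes about the agent (hypothetical shape)."""
    recent_tool_calls: list   # tool call descriptions as strings, most recent last
    idle_seconds: float       # time since the last tool call
    cost_last_10_min: float   # USD spent in the trailing 10 minutes


def detect(snapshot: Snapshot, cfg: WatchdogConfig) -> set:
    """Return the names of the anomalies visible in a single snapshot."""
    anomalies = set()
    tail = snapshot.recent_tool_calls[-cfg.loop_detection_threshold:]
    if len(tail) == cfg.loop_detection_threshold and len(set(tail)) == 1:
        anomalies.add("loop")
    if snapshot.idle_seconds >= cfg.stall_idle_seconds:
        anomalies.add("stall")
    if snapshot.cost_last_10_min > cfg.cost_velocity_max_per_10_min:
        anomalies.add("cost_velocity")
    return anomalies


# Per the Detection Rules table: loop and stall terminate, cost velocity warns.
TERMINATING = {"loop", "stall"}


class Watchdog:
    def __init__(self, cfg: WatchdogConfig):
        self.cfg = cfg
        self.streaks = {}  # anomaly name -> consecutive checks it has been seen

    def tick(self, snapshot: Snapshot) -> dict:
        """Run one check and return {anomaly: action}."""
        current = detect(snapshot, self.cfg)
        # Anomalies absent from this check drop out, which resets their streak.
        self.streaks = {name: self.streaks.get(name, 0) + 1 for name in current}
        actions = {}
        for name, streak in self.streaks.items():
            if name not in TERMINATING:
                actions[name] = "warn"
            elif streak >= self.cfg.min_consecutive_ticks:
                actions[name] = "terminate"
            else:
                actions[name] = "pending"  # ASSUMPTION: no action until it persists
        return actions
```

In this sketch a single healthy check between two stalled ones resets the streak, so a brief pause does not terminate the job.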