Last issue, you built a feedback loop that measures accuracy, detects drift, and tells you exactly what to fix. Your agents are improving every week. You have the data to prove it.
But you are still reviewing every output. Every research summary, every dashboard update, every email draft — you read it before it ships. That made sense when you started. Your agents were new. You did not trust them.
Now some of your agents have been above 95% accuracy for a month. You are spending 20 minutes every morning reviewing outputs that have not had a meaningful error in weeks. That review time is not safety — it is habit.
The fix is earned autonomy. Agents that prove themselves get more freedom. Agents that fail get pulled back. Trust is not binary — manual or autonomous. It is a gradient, backed by data.
The Three-Tier Framework
Every agent starts at Tier 1. Promotion is earned through sustained performance. Demotion is automatic and immediate.
| Tier | Review Mode | Requirement | Guardrails |
|---|---|---|---|
| Tier 1 | Review every output | Default for new agents | Full — every check, every time |
| Tier 2 | Daily summary only | 4+ weeks above 90%, zero CRITICAL errors | Core — top 3 failure patterns only |
| Tier 3 | Drift alerts only | 8+ weeks above 95%, zero CRITICAL or MAJOR | Critical — minimal, targeted checks |
Key insight: The tiers are not about how much you trust the AI. They are about how much evidence you have. Tier 3 is not "I believe this agent is good." It is "I have 8 weeks of data proving this agent makes fewer than 5% errors and has never produced a critical failure." That is not trust — it is engineering.
Component 1: Trust Score Calculator
The trust score combines three factors into a single number: accuracy (how often it gets things right), consistency (how stable its performance is), and severity (how bad its mistakes are when it fails).
You are a trust analyst. Your job is to calculate a trust score
for each agent based on their accuracy history.
Read all weekly accuracy reports in ~/quality/weekly_accuracy_*.json.
Read the drift reports in ~/quality/drift_report_*.json.
For each agent, calculate:
1. ACCURACY COMPONENT (0-60 points):
- Take the rolling 4-week average pass rate
- Scale: 100% = 60 pts, 90% = 40, 80% = 20, 70% or below = 0
- Formula: max(0, (avg_pass_rate - 70) / 30 * 60)
2. CONSISTENCY COMPONENT (0-25 points):
- Standard deviation of the last 4 weekly scores
- Scale: 0 stdev = 25 pts, 5 stdev = 12.5, 10+ stdev = 0
- Formula: max(0, 25 - (stdev * 2.5))
3. SEVERITY COMPONENT (0-15 points):
- Count CRITICAL errors in the last 4 weeks
- Count MAJOR errors in the last 4 weeks
- Start at 15. Deduct 5 per CRITICAL, 2 per MAJOR.
- Floor at 0
TRUST SCORE = accuracy + consistency + severity (0-100)
Output (save to ~/quality/trust_scores_[DATE].json):
{
  "date": "[DATE]",
  "agents": {
    "[AGENT_NAME]": {
      "trust_score": [0-100],
      "accuracy_component": [0-60],
      "consistency_component": [0-25],
      "severity_component": [0-15],
      "current_tier": "REVIEW_REQUIRED" or "SUPERVISED" or "AUTONOMOUS",
      "eligible_for": "TIER_2" or "TIER_3" or "NO_CHANGE",
      "weeks_at_current_accuracy": [count],
      "last_critical_error": "[date or NEVER]",
      "recommendation": "[promote/hold/demote with reason]"
    }
  }
}
Rules:
- Minimum 4 weeks of data required for any promotion
- CRITICAL error in last 2 weeks = cannot be promoted
- Trust score below 50 triggers demotion review
The weighting is deliberate. Accuracy is 60% because it is the primary measure of whether an agent does its job. Consistency is 25% because an agent that swings between 70% and 100% weekly is harder to trust than one steady at 90%. Severity is 15% because one critical error matters more than ten minor ones.
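The arithmetic is simple enough to sanity-check by hand. Here is a minimal Python sketch of the scoring logic — the function name and inputs are illustrative, not part of the prompt's contract:

```python
from statistics import pstdev

def trust_score(weekly_pass_rates, critical_errors, major_errors):
    """Score one agent from its last 4 weekly pass rates (percent)
    and its 4-week CRITICAL / MAJOR error counts."""
    avg = sum(weekly_pass_rates) / len(weekly_pass_rates)

    # Accuracy: 0-60 points, linear from 70% (0 pts) to 100% (60 pts)
    accuracy = max(0.0, (avg - 70) / 30 * 60)

    # Consistency: 0-25 points, penalizing week-to-week volatility
    consistency = max(0.0, 25 - pstdev(weekly_pass_rates) * 2.5)

    # Severity: start at 15, deduct 5 per CRITICAL and 2 per MAJOR, floor at 0
    severity = max(0.0, 15 - 5 * critical_errors - 2 * major_errors)

    return round(accuracy + consistency + severity, 1)

# A steady, error-free agent scores high...
print(trust_score([97, 98, 99, 98], 0, 0))   # → 94.2
# ...a volatile one with a critical error does not.
print(trust_score([72, 95, 80, 93], 1, 2))   # → 37.3
```

Note how the second agent's 85% average accuracy is not what sinks it — the 20-point weekly swings and the critical error are.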
Here is what the scores look like for a real system after 8 weeks:
| Agent | Trust Score | Accuracy | Consistency | Severity | Tier |
|---|---|---|---|---|---|
| Research | 94 | 56/60 | 23/25 | 15/15 | Tier 3 |
| Financial | 82 | 50/60 | 22/25 | 10/15 | Tier 2 |
| Email | 76 | 44/60 | 17/25 | 15/15 | Tier 2 |
| Social | 61 | 36/60 | 15/25 | 10/15 | Tier 1 |
| Scheduling | 43 | 24/60 | 12/25 | 7/15 | Tier 1 |
One agent autonomous. Two supervised. Two still require review. The data made the decision — not gut feeling.
Component 2: Autonomy Level Assigner
The assigner reads trust scores and applies clear promotion and demotion rules. Promotions are flagged for your confirmation. Demotions happen automatically.
You are an autonomy manager. Your job is to assign the correct
autonomy level to each agent based on trust scores and history.
Read trust scores at ~/quality/trust_scores_[DATE].json.
Read autonomy state at ~/quality/autonomy_state.json (create if missing).
PROMOTION RULES:
- TIER 1 → TIER 2: Trust score >= 80 for 4 consecutive weeks
AND zero CRITICAL errors in last 4 weeks
- TIER 2 → TIER 3: Trust score >= 92 for 8 consecutive weeks
AND zero CRITICAL errors in last 8 weeks
AND zero MAJOR errors in last 4 weeks
DEMOTION RULES (automatic, immediate):
- TIER 3 → TIER 2: Any CRITICAL error OR trust score below 85
for 2 consecutive weeks
- TIER 2 → TIER 1: Any CRITICAL error that affected output
quality OR trust score below 70
SAFEGUARDS:
- No agent can skip tiers (1 cannot jump to 3)
- Promotions require confirmation: set "PROMOTION_PENDING"
- Demotions are automatic and immediate
- New agents always start at TIER 1
Output (save to ~/quality/autonomy_state.json):
{
  "last_updated": "[DATE]",
  "agents": {
    "[AGENT_NAME]": {
      "current_tier": 1 or 2 or 3,
      "tier_since": "[date]",
      "trust_score": [score],
      "promotion_pending": true/false,
      "promotion_target": null or 2 or 3,
      "demotion_history": [
        {"from": 2, "to": 1, "date": "[date]", "reason": "..."}
      ],
      "review_mode": "ALL" or "DAILY_SUMMARY" or "DRIFT_ONLY"
    }
  },
  "changes_this_week": [
    {"agent": "[name]", "action": "PROMOTED/DEMOTED/HELD",
     "details": "[reason]"}
  ]
}
Also output ~/quality/autonomy_report_[DATE].txt:
- Current tier of every agent
- Pending promotions with reasoning
- Demotions that happened with root cause
- System autonomy % (Tier 2+ agents / total)
The asymmetry is intentional. Promotion is slow (weeks of evidence). Demotion is instant (one critical error). This mirrors how trust works in real organizations: it takes months to build and seconds to lose. Your agents should earn their autonomy the same way a new hire earns yours.
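The asymmetric rules reduce to a short decision function. A Python sketch — the list-based data shapes here are assumptions for illustration; the prompt above is the actual specification:

```python
def next_tier(tier, scores, criticals, majors):
    """Apply the promotion/demotion rules to one agent.
    scores / criticals / majors: per-week trust scores and
    CRITICAL / MAJOR error counts, most recent week last.
    Returns (new_tier, promotion_pending)."""
    # Demotions: automatic and immediate
    if tier == 3 and (criticals[-1] > 0 or all(s < 85 for s in scores[-2:])):
        return 2, False
    if tier == 2 and (criticals[-1] > 0 or scores[-1] < 70):
        return 1, False

    # Promotions: slow, evidence-based, and human-confirmed.
    # No tier skipping: 1 can only reach 2, 2 can only reach 3.
    if tier == 1 and len(scores) >= 4 and all(s >= 80 for s in scores[-4:]) \
            and sum(criticals[-4:]) == 0:
        return 2, True   # flagged PROMOTION_PENDING until you confirm
    if tier == 2 and len(scores) >= 8 and all(s >= 92 for s in scores[-8:]) \
            and sum(criticals[-8:]) == 0 and sum(majors[-4:]) == 0:
        return 3, True

    return tier, False
```

Note the shape of the function mirrors the asymmetry: demotions check one week of evidence, promotions check four or eight.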
Component 3: Guardrail Generator
Even autonomous agents need guardrails. But the guardrails should be targeted — based on where each agent has actually failed, not generic safety checks.
You are a safety engineer. Generate targeted guardrails for each
agent based on their specific error history.
Read all quality gate results from ~/quality/ (past 60 days).
Read autonomy state at ~/quality/autonomy_state.json.
Read trust scores at ~/quality/trust_scores_[DATE].json.
For each agent:
1. IDENTIFY FAILURE PATTERNS: What errors has this agent made?
When (day, time, conditions)?
2. GENERATE TARGETED GUARDRAILS: For each pattern, create a
specific pre-flight check the agent runs before output.
3. SCALE TO TIER:
- TIER 1: Full guardrails — check everything, every time
- TIER 2: Core guardrails — top 3 failure patterns only
- TIER 3: Critical guardrails — CRITICAL patterns only
Output (save to ~/quality/guardrails_[AGENT_NAME].json):
{
  "agent": "[name]",
  "tier": [tier],
  "generated_date": "[DATE]",
  "guardrails": [
    {
      "id": "GR-001",
      "check": "[what to verify]",
      "trigger": "[when this runs]",
      "severity_if_failed": "CRITICAL" or "MAJOR" or "MINOR",
      "active_at_tiers": [1, 2, 3] or [1, 2] or [1],
      "based_on": "[specific past error justifying this]",
      "implementation": "[exact check logic]"
    }
  ],
  "total_guardrails": [count],
  "active_at_current_tier": [count]
}
Rules:
- Every guardrail must reference a specific past error
- Guardrails must be machine-checkable (not "review quality")
- Tier 3 agents: max 3 active guardrails
- Remove guardrails for error patterns that haven't recurred
in 30+ days
- Regenerate weekly as patterns change
This is the key difference between autonomy and abandonment. An autonomous agent with zero guardrails is a liability. An autonomous agent with three targeted, data-backed guardrails is a well-engineered system.
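Scaling guardrails to tier is ultimately a filter over the generator's output. A minimal sketch, assuming the JSON shape shown above:

```python
def active_guardrails(guardrails, tier):
    """Return the guardrails that apply at an agent's current tier.
    Each guardrail dict carries 'active_at_tiers' and
    'severity_if_failed', as in the generator's output format."""
    active = [g for g in guardrails if tier in g["active_at_tiers"]]
    if tier == 3:
        # Tier 3: CRITICAL patterns only, capped at 3 active checks
        active = [g for g in active if g["severity_if_failed"] == "CRITICAL"][:3]
    return active

rules = [
    {"id": "GR-001", "severity_if_failed": "CRITICAL", "active_at_tiers": [1, 2, 3]},
    {"id": "GR-002", "severity_if_failed": "MAJOR",    "active_at_tiers": [1, 2]},
    {"id": "GR-003", "severity_if_failed": "MINOR",    "active_at_tiers": [1]},
]
print([g["id"] for g in active_guardrails(rules, 3)])   # → ['GR-001']
```

At Tier 1 all three checks run; at Tier 2 the minor one drops out; at Tier 3 only the critical check survives.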
Wiring the Handoff
Step 1: Run the trust calculator. Sunday evening, after your feedback loop from Issue #16. It reads accuracy history and produces trust scores. 1 minute.
Step 2: Run the autonomy assigner. It reads trust scores and updates tier assignments. Promotions are flagged for your confirmation. Demotions happen automatically. 1 minute.
Step 3: Run the guardrail generator. It produces agent-specific safety rules scaled to each agent's current tier. 2 minutes.
Step 4: Review pending promotions. Monday morning, check the autonomy report. If an agent is flagged for promotion, confirm or hold. This is the last manual step for that agent.
Step 5: Let go. This is the hard part. When an agent reaches Tier 3, stop reading its output. Trust the system. If something goes wrong, the drift detector from Issue #16 catches it and the demotion rules pull the agent back automatically.
The full cycle: Sunday: trust scores → tier updates → guardrail refresh. Monday: review pending promotions. Week: observe. Next Sunday: measure again. Your role shifts from reviewer to supervisor to exception handler. You are not less involved — you are involved where it matters.
What the Trajectory Looks Like
| Week | Tier 1 | Tier 2 | Tier 3 | Your Daily Review Time |
|---|---|---|---|---|
| 1 | 5 agents | 0 | 0 | 30 min |
| 5 | 2 agents | 3 | 0 | 15 min |
| 9 | 1 agent | 3 | 1 | 10 min |
| 12 | 1 agent | 1 | 3 | 5 min |
30 minutes down to 5. Not because you are reviewing less carefully — because you are reviewing less unnecessarily. The agents that earned autonomy no longer need your eyes on every output. The ones that still do get your full attention.
What Could Go Wrong
- Promoting too fast. The framework requires 4 weeks for Tier 2 and 8 weeks for Tier 3 for a reason. Resist the urge to manually promote agents because they "seem fine." The data needs time to prove consistency, not just current accuracy.
- Never demoting. Demotions feel like failure. They are not — they are the system working correctly. An agent that gets demoted and re-promoted is stronger than one that was never tested. The demotion history is a feature, not a flaw.
- Guardrail creep. Adding guardrails after every error without removing old ones. An agent with 30 guardrails is not safe — it is slow. Regenerate weekly and let stale patterns age out.
- Confusing autonomy with abandonment. Tier 3 does not mean unmonitored. It means the monitoring is automated — drift detector plus critical guardrails. If you stop running the feedback loop, you have abandoned the agent, not freed it.
The Bottom Line
The handoff is the payoff for everything you have built. Persistent agents (#13), a dashboard to watch them (#14), quality gates to catch errors (#15), a feedback loop to measure improvement (#16) — all of it leads here.
You built the system. You measured it. You improved it. Now you let the data tell you which agents have earned autonomy — and you give it to them.
Try It This Week
If you have 4+ weeks of accuracy data from Issue #16, run the Trust Score Calculator prompt. Look at which agents score above 80. Those are your Tier 2 candidates.
Pick one agent. Calculate its trust score. If it qualifies, promote it to supervised — check its outputs once a day instead of every time. That is the first handoff.
If you are earlier in the arc, keep building. Quality gates (#15) and feedback loops (#16) are prerequisites. The handoff does not work without data to back it up.
Reply to this email with your trust scores — I will tell you which agents are ready for the handoff and which need more time.