
Closed-Loop Learning

The pipeline doesn't just ship code — it learns from every cycle. Corrections, failures, and successful patterns are captured, classified, and fed back into agent behavior. This is what makes the pipeline continuously improving, not just continuously running.

The Learning Loop

Continuous improvement, every pipeline cycle:

Pipeline Run (issue → PR → merge) → Corrections (logged to corrections.md) → Pattern Promotion (HOT tier → patterns.jsonl) → Fleet Gossip (discoveries propagate across the fleet) → Smarter Agents (next session starts primed)

The fleet gets measurably smarter with every completed pipeline cycle.

Correction Logging

When an agent gets corrected — by a human, by another agent, or by a build failure — it logs the correction immediately, before doing anything else.

Correction signals

  • Human says "no", "wrong", "actually", "stop", "don't", "should be", "instead"
  • Another agent corrects the approach
  • A build or test fails because of something the agent did
  • The agent realizes it made a mistake

What gets logged

## 2026-03-15 14:30 — Used sed instead of awk for template substitution
**What I got wrong:** Used sed to substitute issue titles into PR templates,
  which broke on titles containing special characters (/, &, etc.)
**Correct approach:** Use awk for template substitution — it handles
  special characters without escaping
**Source:** Build failure in pipeline-dispatch.sh
**Pattern:** Always use awk over sed for user-provided input substitution
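The logged pattern above can be sketched as a small helper. This is a hypothetical illustration, not the pipeline's actual code: the function name and the `{{TITLE}}` marker are assumptions. The underlying issue is real, though: sed interprets `/`, `&`, and `\` in its replacement text, while awk with `index`/`substr` copies the title verbatim.

```shell
# Hypothetical helper, not from the source: substitute a user-provided
# issue title into a PR template read from stdin.
# A naive  sed "s/{{TITLE}}/$title/"  breaks when the title contains
# /, &, or \ because sed treats them as metacharacters in the replacement.
subst_title() {
  # $1 = title; stdin = template; "{{TITLE}}" is an assumed marker
  awk -v t="$1" '{
    # index/substr copies the title verbatim (gsub would still
    # interpret & and \ in its replacement string)
    while ((i = index($0, "{{TITLE}}")) > 0)
      $0 = substr($0, 1, i - 1) t substr($0, i + 9)  # 9 = marker length
    print
  }'
}

echo 'PR: {{TITLE}}' | subst_title 'fix a/b & c'
# prints: PR: fix a/b & c
```

One caveat: `awk -v` still interprets backslash escapes in the value, so titles containing backslashes need additional care.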

Pattern Promotion

Not every correction becomes a permanent lesson. The system uses a tiered approach:

| Tier | Where | When | Loaded |
|---|---|---|---|
| Raw corrections | corrections.md | Every correction, immediately | On review |
| HOT memory | memory.md | 3+ similar corrections on the same pattern | Every session start |
| Cross-agent | patterns.md (shared) | During consolidation cycles | Before starting work on a new topic |

Why 3 corrections?

One correction could be a fluke. Two is a coincidence. Three is a pattern. Only patterns that repeat get promoted to HOT memory, which is loaded at every session start. This keeps the always-loaded context small and high-signal.
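The three-strikes rule can be sketched as a promotion check. This is an illustrative assumption, not the pipeline's implementation: the file names and the way a "similar correction" is matched (a keyword grep) are simplifications.

```shell
# Hypothetical sketch: promote a pattern to HOT memory once it has been
# logged 3+ times in corrections.md. File names and keyword matching
# are assumed, not from the source.
PROMOTE_THRESHOLD=3

maybe_promote() {
  local pattern="$1" corrections="$2" hot="$3" n
  # count correction entries mentioning the pattern keyword
  n=$(grep -c -i "$pattern" "$corrections" 2>/dev/null || true)
  if [ "$n" -ge "$PROMOTE_THRESHOLD" ]; then
    echo "- $pattern (seen ${n}x)" >> "$hot"   # loaded at every session start
    echo "promoted"
  else
    echo "kept ($n/$PROMOTE_THRESHOLD)"        # stays in raw corrections
  fi
}
```

Usage would look like `maybe_promote 'awk over sed' corrections.md memory.md`: below the threshold the correction stays raw, at three it lands in HOT memory.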

Mutation Protocol

When an agent fails 3 consecutive times on the same problem, it must stop and change strategy. This is the mutation protocol — a forced pivot to prevent repeated failure.

Mutation strategies

| Strategy | When to Use |
|---|---|
| Reduce scope | Task is too large — break it into smaller pieces |
| Change tool | Current tool isn't working — switch to fallback |
| Simplify approach | Over-engineering — do the minimum viable fix |
| Ask for help | Missing context — escalate to human or specialist agent |
| Reframe the problem | Solving the wrong thing — re-read the issue |
| Work around | Direct fix isn't possible — find an alternative path |
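A minimal sketch of the forced pivot, with the failure counter and strategy names as illustrative assumptions rather than the pipeline's real interface:

```shell
# Hypothetical sketch: after 3 consecutive failures on the same problem,
# force a strategy change. Strategy names mirror the table above; none
# of this is from the source implementation.
MAX_ATTEMPTS=3
STRATEGIES="reduce-scope change-tool simplify ask-for-help reframe work-around"
FAILS=0

record_attempt() {
  # $1 = "ok" or "fail"; a success resets the consecutive-failure count
  if [ "$1" = "ok" ]; then
    FAILS=0
    return
  fi
  FAILS=$((FAILS + 1))
  if [ "$FAILS" -ge "$MAX_ATTEMPTS" ]; then
    echo "mutate: switch strategy ($STRATEGIES)"
    FAILS=0   # reset so the new strategy gets its own 3 attempts
  fi
}
```

The key design point is "consecutive": any success resets the counter, so only a genuinely stuck agent triggers the pivot.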

Cross-Agent Learning

Agents don't learn in isolation. A shared patterns file captures learnings that apply across the entire agent fleet. Before starting work on a new topic, agents search shared learnings:

# Before working on authentication:
grep -A5 "auth" ~/learnings/patterns.md

# Before working on database migrations:
grep -A5 "migration" ~/learnings/patterns.md

This means when one agent learns that "transcrypt needs re-initialization in worktrees," every agent benefits from that knowledge on their next relevant task.

Consolidation Cycles

During scheduled maintenance windows (sleep cycles), agents review their corrections and perform housekeeping:

  1. Review corrections.md — look for patterns (3+ similar → promote to HOT)
  2. Scan recent sessions for self-observed improvements
  3. Check memory.md size — if >100 lines, archive least-used patterns
  4. Promote successful mutation strategies to shared patterns
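Step 3 of the housekeeping list might look like the sketch below. The file names are assumed, and "least-used" is approximated here as oldest-first, since a usage tracker is beyond a short example:

```shell
# Hypothetical sketch of memory.md housekeeping: if the file exceeds
# 100 lines, archive the overflow. "Least-used" is approximated as
# oldest-first; a real implementation would track pattern usage.
MEMORY_LIMIT=100

consolidate_memory() {
  # $1 = memory file, $2 = archive file
  local lines
  lines=$(wc -l < "$1")
  lines=$((lines + 0))   # normalize wc output (strips padding)
  if [ "$lines" -gt "$MEMORY_LIMIT" ]; then
    head -n "$((lines - MEMORY_LIMIT))" "$1" >> "$2"   # archive the oldest
    tail -n "$MEMORY_LIMIT" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    echo "archived $((lines - MEMORY_LIMIT)) lines"
  else
    echo "within limit ($lines/$MEMORY_LIMIT)"
  fi
}
```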

Pipeline-Specific Learnings

The pipeline itself generates learnings at every stage:

| Stage | What Gets Learned |
|---|---|
| Research | Which issues are agent-solvable vs need humans — improves classification accuracy |
| Build | Common build failures, worktree setup issues, test patterns |
| Review | Which review comments are valid vs noise — improves fix prioritization |
| Fix | Fix patterns that work vs fail — reduces fix cycle count |
| E2E | Test coverage gaps, browser automation patterns |

Measuring Improvement

The pipeline tracks metrics that indicate whether the learning loop is working:

  • Fix cycle count — average number of fix attempts before comments are resolved. Should decrease over time.
  • Agent-stuck rate — percentage of pipelines that hit the circuit breaker. Should decrease.
  • Research accuracy — percentage of agent-classified issues that complete without human intervention. Should increase.
  • Time to PR — elapsed time from issue creation to PR opened. Should decrease.
  • Pattern count — number of patterns in HOT memory. Growth means the system is learning; plateau means it's stabilizing.
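As an illustration, the first metric could be computed from a simple per-PR log. The log format here is an assumption, not something the source defines:

```shell
# Hypothetical sketch: average fix-cycle count from a log whose assumed
# format is one "<pr-id> <fix_cycles>" pair per line. Tracked week over
# week, a falling average suggests the learning loop is working.
avg_fix_cycles() {
  awk '{ sum += $2; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

printf 'pr-101 3\npr-102 2\npr-103 1\n' | avg_fix_cycles
# prints: 2.00
```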