# Reflection Loop
The reflection loop is the engine of fleet intelligence. Every time an agent makes a mistake — and gets corrected — that correction becomes permanent knowledge, not just for the agent that made the error, but for every agent in the fleet. No human needs to write documentation. No one needs to update a wiki. The loop runs automatically and the fleet gets measurably smarter over time.
This is the headline story of the self-improving system. Everything else — journals, gossip, skills evolution — feeds this loop or distributes its outputs.
## Correction triggers — the entry point
Four signals open the loop: a human correction ("no", "wrong", "actually", "stop", "don't", "should be"); a correction from another agent; a build or test failure caused by the agent's action; or a self-observed mistake during a session. All four are treated identically: the correction is logged immediately.
The rule is absolute: log before the next reply. Not at end of session. Not batched with other corrections. One correction, one entry, right now.
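As an illustrative sketch only (the function name and regex are assumptions, not part of the protocol), the human-correction phrases listed above could be detected with a simple word-boundary match:

```python
import re

# Phrases that signal a human correction (from the list above).
CORRECTION_PHRASES = ["no", "wrong", "actually", "stop", "don't", "should be"]
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in CORRECTION_PHRASES) + r")\b",
    re.IGNORECASE,
)

def is_correction(message: str) -> bool:
    """Return True if the message contains a correction trigger phrase."""
    return bool(PATTERN.search(message))

print(is_correction("No, that flag is wrong"))  # True
print(is_correction("Looks good, ship it"))     # False
```

A real detector would be more nuanced (e.g. "no" inside a longer word or quoted text), but the gate itself is deliberately cheap: any plausible trigger opens the loop.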
## corrections.md — the write path
Every correction entry has four fields: date/time, what went wrong, the correct approach, and the generalizable pattern. The pattern field is the most important — it turns a specific mistake into a reusable lesson. "I used the wrong flag on this CLI" is not a pattern. "Always check CLI version before using --json flag — older versions use --format=json" is a pattern.
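A minimal sketch of the write path, assuming corrections.md lives in the working directory (the function name and markdown layout of the entry are illustrative, not the canonical format):

```python
from datetime import datetime, timezone
from pathlib import Path

def log_correction(what_went_wrong: str, correct_approach: str, pattern: str,
                   path: str = "corrections.md") -> None:
    """Append a four-field entry immediately — log before the next reply."""
    entry = (
        f"\n## {datetime.now(timezone.utc).isoformat(timespec='seconds')}\n"
        f"- What went wrong: {what_went_wrong}\n"
        f"- Correct approach: {correct_approach}\n"
        f"- Pattern: {pattern}\n"
    )
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(entry)

log_correction(
    "Used --json on an old CLI version",
    "Check the CLI version first; older versions use --format=json",
    "Always check CLI version before using --json flag",
)
```

Appending (mode `"a"`) rather than rewriting keeps the log cheap enough that "one correction, one entry, right now" never gets deferred.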
## Occurrence counting — the promotion gate
At the third occurrence of the same pattern (same root cause, same generalizable lesson), a promotion fires. The corrected understanding moves from corrections.md (reviewed but not always loaded) to the memory.md HOT tier (loaded every single session, every turn, always in context). From that point forward, the agent cannot forget it. It is permanent.
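The gate itself is a simple counter keyed on the generalizable pattern. A sketch (the names `record_occurrence` and `PROMOTION_THRESHOLD` are assumptions for illustration):

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # the third occurrence fires the promotion

_occurrences: Counter = Counter()

def record_occurrence(pattern: str) -> bool:
    """Count a repeated pattern; return True exactly when it promotes to HOT."""
    _occurrences[pattern] += 1
    return _occurrences[pattern] == PROMOTION_THRESHOLD

key = "check CLI version before using --json"
print([record_occurrence(key) for _ in range(4)])
# [False, False, True, False] — fires once, on the third occurrence
```

Returning `True` only at exactly the threshold makes the promotion idempotent: a fourth or fifth occurrence of an already-promoted pattern does not re-fire the move to memory.md.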
## Mutation protocol — the escape hatch for stuck approaches
When three consecutive failures hit the same problem — same approach, same error — the mutation protocol activates. The agent must stop, declare a named strategy shift in JOURNAL.md, and continue with the new strategy only after declaring. Retrying the same approach a 4th time without declaration is a protocol violation.
The declaration looks like this:

```
## MUTATION: M5 (Reduce Scope)
Failed 3x on: getting the full pipeline working end-to-end
Switching to: build just the ingestion step in isolation
Rationale: problem is too large to debug whole — find the
minimal failing case, then expand
```

## The 8 mutation strategies
Eight named strategies cover every meaningful pivot. Each has a specific trigger and a concrete example of when to apply it.
| Strategy | When to use | Real example |
|---|---|---|
| M1 — Fallback Dependency | An external dep is unreachable, broken, or rate-limited | OpenAI API keeps timing out → switch to Anthropic for this task |
| M2 — Extreme Debug Logging | You can't see what's failing — the system is a black box | Build keeps failing silently → add verbose logging to every step before making any more changes |
| M3 — Hardcode & Isolate | Too many variables — can't tell which one is failing | Auth flow broken → hardcode the token, remove all dynamic parts, verify the raw request works |
| M4 — Switch Libraries | The library itself is the bug — not your usage of it | PDF parsing library corrupts Unicode → switch to a different parser entirely |
| M5 — Reduce Scope | Problem is too large to debug as a whole | Full pipeline broken → build just the ingestion step in isolation, get that green, then expand |
| M6 — Invert the Approach | The entire approach is wrong — not the implementation | Trying to diff HTML → stop, work with the DOM AST instead, HTML diffing is the wrong layer |
| M7 — Ask the Oracle | You need knowledge you don't have — and it probably exists somewhere | Failing on obscure Convex behaviour → search clan-learnings, discoveries.jsonl, GraphRAG before writing another line |
| M8 — Read the Error | You're ignoring an obvious signal in the error output | Failing with "missing field: organizationId" → stop, read that literally, pass organizationId — don't keep tweaking other things |
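The trigger-and-declare mechanics above can be sketched as a small gate (class and method names are illustrative assumptions; the real protocol writes the declaration into JOURNAL.md):

```python
class MutationGate:
    """Tracks consecutive same-approach failures; after 3, a declared
    strategy shift is required before any further attempt (sketch only)."""
    LIMIT = 3

    def __init__(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1

    def declare(self, strategy: str, rationale: str) -> None:
        # In the real protocol this writes a MUTATION block to JOURNAL.md.
        print(f"## MUTATION: {strategy}\nRationale: {rationale}")
        self.failures = 0  # the counter resets for the new strategy

    def may_retry(self) -> bool:
        """A fourth same-approach attempt without a declaration is a violation."""
        return self.failures < self.LIMIT

gate = MutationGate()
for _ in range(3):
    gate.record_failure()
print(gate.may_retry())  # False: must declare a mutation before attempt 4
gate.declare("M5 (Reduce Scope)", "problem too large to debug whole")
print(gate.may_retry())  # True: retries allowed under the new strategy
```

Resetting the counter on declaration is the key design point: the gate blocks repetition of a failed approach, not retries as such.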
Successful mutations don't just help the current agent — they propagate to clan-learnings/patterns.md during the sleep cycle, making the strategy available to every agent for similar situations.
## HOT tier promotion — permanent memory
Once promoted to memory.md, a correction is HOT. It loads at the start of every session, before any task context, before any tool calls. It cannot be evicted by context pressure. It survives compaction. The agent starts with it already applied, before encountering any situation that might trigger the original mistake.
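The loading order can be sketched as follows — HOT memory is prepended ahead of all task context, so promoted corrections are in effect from the first turn (the function name and file layout are assumptions for illustration):

```python
from pathlib import Path

def build_session_context(task_context: str, memory_path: str = "memory.md") -> str:
    """Place HOT-tier memory ahead of all task context (illustrative sketch)."""
    p = Path(memory_path)
    hot = p.read_text(encoding="utf-8") if p.exists() else ""
    # HOT tier first: it precedes task context and any tool calls,
    # and is never subject to eviction under context pressure.
    return hot + "\n---\n" + task_context
```

Usage: `build_session_context("Fix the ingestion step")` returns a prompt whose first bytes are the promoted corrections, not the task.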
## Fleet distribution — clan-learnings and gossip
HOT-tier patterns with fleet-wide relevance get published to clan-learnings/patterns.jsonl — the append-only CRDT source of truth consumed by all seven agents. Separately, non-obvious findings discovered during the work (workarounds that took over 15 minutes, undocumented behaviors) go to discoveries.jsonl. Both are read at session start by every agent. The fleet learns from every correction, not just the agent that made the original mistake.
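Because both feeds are append-only JSONL (one self-contained JSON record per line), publishing and reading are trivial to sketch — concurrent appends never corrupt earlier entries. Function names and record fields below are illustrative assumptions:

```python
import json
from pathlib import Path

def append_record(path: str, record: dict) -> None:
    """Publish one finding by appending a single JSON line (append-only)."""
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_jsonl(path: str) -> list[dict]:
    """Read an entire feed at session start; each line is one record."""
    p = Path(path)
    if not p.exists():
        return []
    return [json.loads(line) for line in p.read_text(encoding="utf-8").splitlines()
            if line.strip()]

append_record("discoveries.jsonl",
              {"agent": "a1", "finding": "undocumented flag behavior"})
records = read_jsonl("discoveries.jsonl")
```

Append-only line records are what make the feed CRDT-friendly: merging two agents' copies is just a set union of lines, with no record ever rewritten in place.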
## Deeper dives
- Cognitive Journals — where mutation declarations live and how handovers use them
- Discovery Gossip & CRDT — how corrections reach the fleet
- Knowledge & Memory overview — HOT/WARM/BANK/ARCHIVE tiers and session-start loading