Agent Intelligence
Agent Intelligence is how agents remember, recall, learn, and propagate understanding across an entire fleet — permanently. It is not a database you configure. It is a closed loop: agents write observations, index them automatically, search them at inference time, get corrected when they're wrong, and make those corrections available to every other agent in the clan. The loop runs continuously, without human intervention, and the fleet gets measurably smarter over time.
The seven subsystems work as a single closed loop. Each one has a deeper dive, linked at the end of this page.
The seven subsystems at a glance
| Subsystem | What it does | Core mechanic |
|---|---|---|
| Reflection Loop | Agents learn from corrections and never repeat the same mistake | corrections.md → occurrence counter → HOT promotion → M1-M8 mutations |
| Knowledge Graph | Structural memory — entities, relationships, code sidecars indexed and searchable | entity extraction → bank/entities/ → graph index → GraphRAG traversal |
| Cognitive Journals | Per-session decision log for debugging, handovers, and mutation declarations | JOURNAL.md in every worktree → archive to clan-learnings on task done |
| Gossip & CRDT | What one agent learns, all agents know — without a coordinator | discoveries.jsonl → tail-20 at session start → CRDT merge → fleet-wide patterns |
| Research Loop | Gate on architectural decisions — hypothesis before building | PROPOSAL.md → PoC → blind review → Splinter accept/reject → clan-learnings |
| Skills Evolution | Repeated manual steps get packaged into reusable skills automatically | H1-H4 detection → SKILL-PROPOSAL.md → build → Velma 0.9 score gate → adopt |
| GraphRAG | Deep structural search — traverses entity relationships, not just text similarity | entity extraction → relationship indexing → graph traversal → structural results |
Everything is built on plain Markdown files. The indexes are derived and rebuilt on demand. Lose the index? Rebuild it. Change the embedding model? Re-embed. The source of truth is always files in git — readable, editable, version-controlled.
What agents know and when — the loading sequence
Every session assembles its context in a specific order. What loads first is always loaded. What loads later is conditional. What's searched is on-demand. Understanding this sequence is the key to understanding why agents remember what they remember.
HOT — always in context
Workspace files (SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, HEARTBEAT.md) and MEMORY.md are injected into every session, every turn. These are the agent's permanent personality, standing orders, and curated long-term facts. Anything promoted to HOT tier stays there until explicitly archived. Target size: under 8k tokens.
WARM — loaded at session start
Today's and yesterday's daily logs (memory/YYYY-MM-DD.md) are read at session start for continuity — what happened recently, what was decided, what's in flight. The tail of clan-learnings/discoveries.jsonl (last 20 entries) is also loaded, giving the agent recent cross-fleet findings before any work begins.
BANK — searched on demand
Older logs, entity pages (bank/entities/), and session transcripts live in the BANK tier. They're not loaded automatically — they're retrieved by the search pipeline when a query matches. The hybrid retrieval system (see below) handles this transparently.
ARCHIVE — rarely accessed
Compacted old sessions, superseded patterns, and archived self-improvement entries live here. Accessible by explicit recall only.
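The four-tier loading sequence can be sketched as a small assembly function. This is a toy sketch: the tier contents mirror the description above, but the function name and argument shapes are assumptions, not the real loader's API.

```python
# Hypothetical sketch of the tiered loading sequence described above.
# HOT files are always injected; WARM content is loaded once at session
# start; BANK and ARCHIVE are deliberately absent from assembly.

HOT = ["SOUL.md", "AGENTS.md", "IDENTITY.md", "TOOLS.md", "HEARTBEAT.md", "MEMORY.md"]

def assemble_context(today_log, yesterday_log, discoveries_tail):
    context = []
    context.extend(HOT)                     # HOT: every session, every turn
    context.append(today_log)               # WARM: continuity from daily logs
    context.append(yesterday_log)
    context.extend(discoveries_tail[-20:])  # WARM: last 20 fleet discoveries
    # BANK is reached only through the search pipeline on demand;
    # ARCHIVE only by explicit recall. Neither is loaded here.
    return context
```

The ordering matters: anything in HOT survives every turn, while WARM content is a one-time snapshot taken before any work begins.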
How agents remember — write paths
Memory is append-only. Agents never overwrite — they append, date-stamp, and let the indexer pick up changes. Six write paths feed the knowledge system:
- Daily log append — decisions, observations, anything worth remembering go into memory/YYYY-MM-DD.md throughout the session
- Session compaction flush — when context nears limits, a memory flush writes a structured summary before the session compacts. Nothing is lost.
- corrections.md write — every correction, immediately, before the next reply. No batching, no delays.
- Entity page update — stable facts about people, projects, systems curated into bank/entities/
- Gossip broadcast receive — incoming pattern/discovery entries from other agents via CRDT merge
- Discovery publish — non-obvious findings (workarounds > 15min, undocumented behaviors) written to clan-learnings/discoveries.jsonl
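Two of these write paths, the daily log append and the discovery publish, can be sketched as plain append-only file operations. The paths follow the conventions above; the function names are illustrative assumptions.

```python
import json
import datetime
import pathlib

def append_daily_log(root: pathlib.Path, note: str) -> pathlib.Path:
    """Append-only: one date-stamped file per day, never overwritten."""
    day = datetime.date.today().isoformat()   # YYYY-MM-DD
    path = root / "memory" / f"{day}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:                 # append mode, never truncate
        f.write(f"- {note}\n")
    return path

def publish_discovery(root: pathlib.Path, finding: dict) -> None:
    """One JSON object per line, appended to the shared discoveries feed."""
    path = root / "clan-learnings" / "discoveries.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(json.dumps(finding) + "\n")
```

Because both writers only ever append, concurrent sessions never clobber each other's entries, and the indexer can treat every file as a monotonically growing log.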
How agents search — parallel retrieval
Search runs two parallel paths. Text retrieval (vector + BM25, or the QMD sidecar when active) and graph traversal (GraphRAG) run simultaneously. Results are merged, deduplicated, re-ranked by recency and diversity, and returned as cited context bundles — small, attributed snippets the agent can use and reference exactly.
Text retrieval — vector + BM25 (or QMD)
Vector and BM25 run simultaneously. Results merge at 70/30 weighting by default, then pass through temporal decay and MMR re-ranking. When the QMD sidecar is active, it replaces both — bundling BM25, vector search, and reranking into one local pipeline via Bun + node-llama-cpp.
T1 — BM25 / FTS5 (keyword)
SQLite FTS5 token matching. Under 10ms, no ML, no model required. Best for exact identifiers: ticket numbers, env var names, error strings, code symbols. Weak on paraphrased recall.
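A minimal sketch of this tier, assuming your SQLite build ships the FTS5 extension (most CPython builds do). The table schema and sample rows are illustrative, not the system's real index layout.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# A virtual FTS5 table: tokenized full-text index over path + body.
db.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
db.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("memory/2025-01-02.md", "fixed TICKET-4521 by unsetting the gateway env var"),
        ("memory/2025-01-01.md", "the gateway runs on the Mac Studio"),
    ],
)
# Exact identifiers are where FTS5 shines. Hyphenated identifiers are
# quoted so FTS5 treats them as a phrase rather than query syntax.
rows = db.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY rank",
    ('"TICKET-4521"',),
).fetchall()
```

Note the quoting: an unquoted `TICKET-4521` would be parsed as FTS5 query syntax, while the quoted phrase matches the tokenized identifier exactly.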
T2 — Vector search (semantic)
Chunks (~400 tokens) embedded with a configurable model (local GGUF, Ollama, OpenAI, Gemini, Voyage, Mistral). Cosine similarity matching. Best for paraphrased recall — "machine running the gateway" finds "Mac Studio gateway host." Weak on exact tokens.
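The ranking step of this tier can be illustrated with plain cosine similarity. The vectors below are hand-rolled stand-ins for real embeddings; only the similarity math mirrors the pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query_vec, chunks):
    """chunks: list of (chunk_id, vector). Highest similarity first."""
    return sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
```

In the real pipeline the vectors come from whichever embedding model is configured; the ranking logic is the same regardless of provider.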
T3 — QMD sidecar (hybrid + reranking)
BM25 + vector search + reranking in one local pipeline. Runs via Bun + node-llama-cpp, auto-downloads GGUF models, managed as a gateway subprocess. Best of both modes with a unified interface.
T4 — GraphRAG (structural)
Entity-aware traversal over a knowledge graph with 129 entities and 131 relationships. Answers structural queries: "Who worked on the auth system?", "What depends on payments?", "Show everything related to the migration." Complements vector and keyword — finds information that is structurally related, not just textually similar.
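The traversal itself is ordinary graph search. A sketch over hypothetical relationship triples (the real graph is derived from bank/entities/; these entity names are invented for illustration):

```python
from collections import deque

# Hypothetical triples standing in for the indexed knowledge graph.
EDGES = [
    ("alice", "worked_on", "auth-system"),
    ("auth-system", "depends_on", "payments"),
    ("bob", "worked_on", "payments"),
]

def related(seed, edges, max_hops=2):
    """Breadth-first expansion from a seed entity, up to max_hops away."""
    adjacency = {}
    for src, _, dst in edges:             # undirected for recall purposes
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue                      # don't expand past the hop budget
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return seen - {seed}
```

This is why GraphRAG answers "what depends on payments?" even when the relevant documents share no vocabulary with the query: the connection is an edge, not a phrase.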
Post-processing: merge, decay, re-rank
All four result sets merge with score fusion and deduplication. Temporal decay applies an exponential recency boost (30-day half-life) — recent memories score higher. Evergreen files (MEMORY.md, entity pages) are never decayed. MMR re-ranking (Maximal Marginal Relevance) then ensures top results cover diverse aspects rather than repeating the same information.
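The decay and re-ranking math can be sketched directly. The 30-day half-life and the evergreen exemption come from the description above; the `lam` trade-off parameter and function names are assumptions.

```python
import math

HALF_LIFE_DAYS = 30
EVERGREEN = {"MEMORY.md"}   # entity pages would also belong here

def decayed_score(score, age_days, path):
    """Exponential recency decay: halves every 30 days, except evergreen."""
    if path in EVERGREEN:
        return score
    return score * 0.5 ** (age_days / HALF_LIFE_DAYS)

def mmr(candidates, sim, k, lam=0.7):
    """Maximal Marginal Relevance: balance relevance against diversity.
    candidates: list of (id, relevance); sim(a, b): pairwise similarity."""
    selected = []
    pool = dict(candidates)
    while pool and len(selected) < k:
        best = max(
            pool,
            key=lambda c: lam * pool[c]
            - (1 - lam) * max((sim(c, s) for s in selected), default=0.0),
        )
        selected.append(best)
        del pool[best]
    return selected
```

With a 30-day half-life, a memory from a month ago scores half what it would today, and one from two months ago a quarter; MMR then penalizes candidates that look like results already picked, so the top slots cover different aspects.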
How agents learn — the closed-loop reflection system
This is the core contribution: a self-improving loop that turns every correction into permanent fleet-wide knowledge. The loop is automatic. No human intervention required after the initial correction.
The trigger is any correction — a human saying "no", another agent correcting, a build failure. The agent logs it to corrections.md immediately, before the next reply. Format: what went wrong, correct approach, source, pattern. No batching. No delays.
At the third occurrence of the same pattern, a promotion fires: the corrected understanding moves to MEMORY.md (HOT tier, always loaded). From that point, every session the agent runs starts with that correction baked in. It cannot be forgotten.
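The promotion rule reduces to a per-pattern occurrence counter. A minimal sketch, assuming a plain dict keyed by pattern id (the function name is illustrative):

```python
PROMOTION_THRESHOLD = 3

def record_correction(counts: dict, pattern: str) -> bool:
    """Increment the pattern's occurrence count.
    Returns True exactly once: when the count hits the promotion threshold,
    at which point the corrected understanding moves to the HOT tier."""
    counts[pattern] = counts.get(pattern, 0) + 1
    return counts[pattern] == PROMOTION_THRESHOLD
```

Firing exactly at the threshold (rather than at-or-above) keeps the promotion idempotent: a pattern already in HOT is not re-promoted on later corrections.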
When the mutation protocol activates (3 consecutive failures on the same approach), the agent declares a named strategy change in JOURNAL.md — M1 through M8 (pivot, decompose, escalate, simplify, reduce scope, rebuild, delegate, abandon) — and continues with the new strategy. Successful mutations propagate to clan-learnings/patterns.md during the sleep cycle.
How knowledge distributes — clan-learnings and CRDT gossip
An agent that learns something valuable shouldn't be the only one who knows it. The clan-learnings system propagates high-signal knowledge to every agent in the fleet, automatically, using a gossip protocol with CRDT merge semantics.
patterns.jsonl is the source of truth for cross-agent learnings — never edit it directly. Append a new entry, run render-patterns, and the generated patterns.md updates. Entries can supersede older ones ("supersedes": "old-id"). Every agent runs grep -A5 <topic> ~/d/clan-learnings/patterns.md before starting work on any unfamiliar topic.
discoveries.jsonl handles non-obvious findings: undocumented API behaviors, workarounds that cost more than 15 minutes, performance findings. Published by any agent, consumed by all at session start (tail -20). Routine fixes and things already in patterns.md don't belong here.
CRDT semantics keep the distributed state consistent without coordination: OR-Set for adding patterns (concurrent adds always converge), LWW-Register for updates (last-write-wins with timestamp), G-Counter for occurrence counts (monotonically increasing, merge by max). The knowledge graph grows without conflict resolution overhead.
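Each of the three merge rules can be sketched in a few lines. Real CRDT implementations carry add/remove tags (OR-Set) and replica identifiers; these sketches show only the convergence property each rule guarantees.

```python
def or_set_merge(a: set, b: set) -> set:
    """OR-Set (add-only view): concurrent adds always survive the merge."""
    return a | b

def lww_merge(a, b):
    """LWW-Register: values are (timestamp, value); the later write wins."""
    return a if a[0] >= b[0] else b

def g_counter_merge(a: dict, b: dict) -> dict:
    """G-Counter: per-replica counts only grow; merge takes the max per replica."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}
```

All three merges are commutative, associative, and idempotent, which is exactly what lets gossip deliver updates in any order, any number of times, and still converge without a coordinator.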
Deeper dives
- Reflection Loop — the full closed-loop self-improvement cycle, correction triggers, M1-M8 mutation protocol, and fleet propagation
- Knowledge Graph — entity extraction pipeline, 129 entities, 37 sidecars, structural traversal
- Cognitive Journals — JOURNAL.md lifecycle, per-worktree rule, mandatory handover reads
- Discovery Gossip & CRDT — discoveries.jsonl, patterns.jsonl, OR-Set/LWW-Register/G-Counter mechanics
- Research Loop — PROPOSAL.md gate, blind review protocol, hypothesis-before-architecture
- Skills Evolution — detection heuristics H1-H4, pipeline, 0.9 score threshold, 30-day monitoring
- GraphRAG — entity extraction, relationship indexing, and structural knowledge traversal
- Distributed Knowledge — gossip protocol, CRDT-based state sync, distributed knowledge at fleet scale
- Security Model — how knowledge isolation works across sandboxed agents
- Pipeline & CAE — how the reflection loop connects to the full delivery cycle