
Agent Intelligence

Agent Intelligence is how agents remember, recall, learn, and propagate understanding across an entire fleet — permanently. It is not a database you configure. It is a closed loop: agents write observations, index them automatically, search them at inference time, get corrected when they're wrong, and make those corrections available to every other agent in the clan. The loop runs continuously, without human intervention, and the fleet gets measurably smarter over time.


The seven subsystems work as a single closed loop. Each is summarized in the table below; the Deeper dives section at the end covers each in depth.

The seven subsystems at a glance

| Subsystem | What it does | Core mechanic |
| --- | --- | --- |
| Reflection Loop | Agents learn from corrections and never repeat the same mistake | corrections.md → occurrence counter → HOT promotion → M1-M8 mutations |
| Knowledge Graph | Structural memory — entities, relationships, code sidecars indexed and searchable | entity extraction → bank/entities/ → graph index → GraphRAG traversal |
| Cognitive Journals | Per-session decision log for debugging, handovers, and mutation declarations | JOURNAL.md in every worktree → archive to clan-learnings on task done |
| Gossip & CRDT | What one agent learns, all agents know — without a coordinator | discoveries.jsonl → tail-20 at session start → CRDT merge → fleet-wide patterns |
| Research Loop | Gate on architectural decisions — hypothesis before building | PROPOSAL.md → PoC → blind review → Splinter accept/reject → clan-learnings |
| Skills Evolution | Repeated manual steps get packaged into reusable skills automatically | H1-H4 detection → SKILL-PROPOSAL.md → build → Velma 0.9 score gate → adopt |
| GraphRAG | Deep structural search — traverses entity relationships, not just text similarity | entity extraction → relationship indexing → graph traversal → structural results |

Everything is built on plain Markdown files. The indexes are derived and rebuilt on demand. Lose the index? Rebuild it. Change the embedding model? Re-embed. The source of truth is always files in git — readable, editable, version-controlled.

What agents know and when — the loading sequence

Every session assembles its context in a specific order. What loads first is always loaded. What loads later is conditional. What's searched is on-demand. Understanding this sequence is the key to understanding why agents remember what they remember.

[Diagram: session start, knowledge loading sequence. (1) Workspace files: SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, HEARTBEAT.md (HOT) → (2) MEMORY.md: curated permanent facts and promoted corrections (HOT) → (3) today's daily log, memory/YYYY-MM-DD.md (WARM) → (4) yesterday's log (WARM) → (5) clan-learnings: patterns.md and discoveries tail -20 (WARM) → (6) search priming: hybrid vec+BM25 query on the task topic, MMR-ranked (on-demand) → (7) assembled agent context, work begins. Tier sizes: HOT under 8k tokens, always in context; WARM 8–32k tokens, loaded at session start; BANK unbounded, chunked, searched on demand; ARCHIVE compressed chunks, explicit recall only.]

HOT — always in context

Workspace files (SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, HEARTBEAT.md) and MEMORY.md are injected into every session, every turn. These are the agent's permanent personality, standing orders, and curated long-term facts. Anything promoted to HOT tier stays there until explicitly archived. Target size: under 8k tokens.

WARM — loaded at session start

Today's and yesterday's daily logs (memory/YYYY-MM-DD.md) are read at session start for continuity — what happened recently, what was decided, what's in flight. The tail of clan-learnings/discoveries.jsonl (last 20 entries) is also loaded, giving the agent recent cross-fleet findings before any work begins.

BANK — searched on demand

Older logs, entity pages (bank/entities/), and session transcripts live in the BANK tier. They're not loaded automatically — they're retrieved by the search pipeline when a query matches. The hybrid retrieval system (see below) handles this transparently.

ARCHIVE — rarely accessed

Compacted old sessions, superseded patterns, and archived self-improvement entries live here. Accessible by explicit recall only.
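The tier assembly at session start could be sketched as follows. This is a minimal illustration: the file layout follows the tiers described above, but the function name and exact ordering logic are assumptions.

```python
from datetime import date, timedelta
from pathlib import Path

HOT_FILES = ["SOUL.md", "AGENTS.md", "IDENTITY.md",
             "TOOLS.md", "HEARTBEAT.md", "MEMORY.md"]

def load_session_context(workspace: Path) -> str:
    """Assemble context in tier order: HOT first, then WARM; BANK is search-only."""
    parts = []
    # HOT -- always in context, every session, every turn
    for name in HOT_FILES:
        f = workspace / name
        if f.exists():
            parts.append(f.read_text())
    # WARM -- today's and yesterday's daily logs, loaded once at session start
    today = date.today()
    for d in (today, today - timedelta(days=1)):
        log = workspace / "memory" / f"{d.isoformat()}.md"
        if log.exists():
            parts.append(log.read_text())
    # WARM -- last 20 cross-fleet discoveries
    disc = workspace / "clan-learnings" / "discoveries.jsonl"
    if disc.exists():
        parts.append("\n".join(disc.read_text().splitlines()[-20:]))
    return "\n\n".join(parts)
```

BANK and ARCHIVE never appear here: they are reached only through the retrieval pipeline described in the next section.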

How agents remember — write paths

Memory is append-only. Agents never overwrite — they append, date-stamp, and let the indexer pick up changes. Six write paths feed the knowledge system:

  • Daily log append — decisions, observations, anything worth remembering go into memory/YYYY-MM-DD.md throughout the session
  • Session compaction flush — when context nears limits, a memory flush writes a structured summary before the session compacts. Nothing is lost.
  • corrections.md write — every correction, immediately, before the next reply. No batching, no delays.
  • Entity page update — stable facts about people, projects, systems curated into bank/entities/
  • Gossip broadcast receive — incoming pattern/discovery entries from other agents via CRDT merge
  • Discovery publish — non-obvious findings (workarounds > 15min, undocumented behaviors) written to clan-learnings/discoveries.jsonl
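A discovery publish is just an append of one JSON line. A sketch, using the field names listed for the discoveries channel (id, type, agent, ts, context, problem, solution, tags, confidence); the function name and exact value formats are assumptions:

```python
import json
import time
import uuid
from pathlib import Path

def publish_discovery(path: Path, agent: str, problem: str, solution: str,
                      tags: list, confidence: float) -> dict:
    """Append one discovery entry; the file is append-only, never rewritten."""
    entry = {
        "id": str(uuid.uuid4()),
        "type": "discovery",
        "agent": agent,
        "ts": int(time.time()),
        "context": "",          # free-text description of the situation
        "problem": problem,
        "solution": solution,
        "tags": tags,
        "confidence": confidence,
    }
    with path.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```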

How agents search — parallel retrieval

Search runs two parallel paths. Text retrieval (vector + BM25, or the QMD sidecar when active) and graph traversal (GraphRAG) run simultaneously. Results are merged, deduplicated, re-ranked by recency and diversity, and returned as cited context bundles — small, attributed snippets the agent can use and reference exactly.

[Diagram: knowledge retrieval, two parallel paths. A natural-language query fans out to text retrieval and graph traversal. Text path: vector search (sqlite-vec, cosine similarity, local GGUF/Ollama/API model, good for paraphrased recall) runs alongside BM25/FTS5 (SQLite token matching, under 10ms, no model, good for exact IDs, error strings, symbols); results pass through hybrid merge: 70% vec + 30% BM25 score fusion, deduplication, temporal decay with a 30-day half-life, MMR re-ranking. When the QMD sidecar is active, it replaces vector+BM25 with one Bun pipeline (BM25 + vec + reranking, auto GGUF download, offline). Graph path: GraphRAG (129 entities, 131 relationships) answers structural queries via entity walks and dependency chains. Everything merges into a context bundle: cited snippets with source paths, line ranges, and scores, plus structural facts and entity relations. Current knowledge graph: 129 entities, 131 relationships, 283 patterns.]

Text retrieval — vector + BM25 (or QMD)

Vector and BM25 run simultaneously. Results merge at 70/30 weighting by default, then pass through temporal decay and MMR re-ranking. When the QMD sidecar is active, it replaces both — bundling BM25, vector search, and reranking into one local pipeline via Bun + node-llama-cpp.
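The 70/30 fusion step can be illustrated in a few lines. A sketch, assuming both retrievers return scores normalized to [0, 1]; the function name is hypothetical:

```python
def fuse_scores(vec_hits: dict, bm25_hits: dict,
                w_vec: float = 0.7, w_bm25: float = 0.3) -> dict:
    """Weighted score fusion across both retrievers.

    A document found by only one retriever contributes 0 from the other,
    so exact-token hits and semantic hits both survive the merge.
    """
    fused = {}
    for doc_id in set(vec_hits) | set(bm25_hits):
        fused[doc_id] = (w_vec * vec_hits.get(doc_id, 0.0)
                         + w_bm25 * bm25_hits.get(doc_id, 0.0))
    return fused
```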

T1 — BM25 / FTS5 (keyword)

SQLite FTS5 token matching. Under 10ms, no ML, no model required. Best for exact identifiers: ticket numbers, env var names, error strings, code symbols. Weak on paraphrased recall.
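A minimal FTS5 sketch of this mode; the chunks schema and sample data are illustrative, not the actual index layout:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# FTS5 virtual table over memory chunks (schema is illustrative)
con.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
con.execute("INSERT INTO chunks VALUES (?, ?)",
            ("memory/2024-01-05.md", "SHOTAPP-42 failed: gateway env var unset"))
con.execute("INSERT INTO chunks VALUES (?, ?)",
            ("memory/2024-01-06.md", "refactored the retry loop"))
# Exact-token phrase match with BM25 ranking -- fast, no model required
rows = con.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ('"SHOTAPP-42"',)).fetchall()
# rows now holds the matching paths, best-ranked first
```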

T2 — Vector search (semantic)

Chunks (~400 tokens) embedded with a configurable model (local GGUF, Ollama, OpenAI, Gemini, Voyage, Mistral). Cosine similarity matching. Best for paraphrased recall — "machine running the gateway" finds "Mac Studio gateway host." Weak on exact tokens.
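Semantic matching bottoms out in cosine similarity between embedding vectors. A bare-bones sketch; real deployments use sqlite-vec's indexed distance computation rather than Python loops:

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity: 1.0 for identical direction, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```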

T3 — QMD sidecar (hybrid + reranking)

BM25 + vector search + reranking in one local pipeline. Runs via Bun + node-llama-cpp, auto-downloads GGUF models, managed as a gateway subprocess. Best of both modes with a unified interface.

T4 — GraphRAG (structural)

Entity-aware traversal over a knowledge graph with 129 entities and 131 relationships. Answers structural queries: "Who worked on the auth system?", "What depends on payments?", "Show everything related to the migration." Complements vector and keyword — finds information that is structurally related, not just textually similar.
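Structural retrieval is, at its core, a bounded walk over the relationship index. A toy sketch; the graph contents and function name are illustrative:

```python
from collections import deque

# Toy relationship index: entity -> [(relation, entity), ...] -- names are made up
GRAPH = {
    "auth-system": [("worked_on_by", "alice"), ("depends_on", "payments")],
    "payments":    [("worked_on_by", "bob")],
    "alice":       [],
    "bob":         [],
}

def related(start: str, max_hops: int = 2) -> set:
    """Breadth-first walk over the relationship index, up to max_hops away."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted for this branch
        for _relation, neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen - {start}
```

A query like "who worked on auth?" becomes `related("auth-system")` followed by filtering on the relation labels; no text similarity is involved.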

Post-processing: merge, decay, re-rank

The result sets from all active retrieval modes merge with score fusion and deduplication. Temporal decay applies an exponential recency boost (30-day half-life) — recent memories score higher. Evergreen files (MEMORY.md, entity pages) are never decayed. MMR re-ranking (Maximal Marginal Relevance) then ensures top results cover diverse aspects rather than repeating the same information.
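The decay step reduces to one formula: score × 0.5^(age / half-life). A sketch with the evergreen exemption included; the function name is assumed:

```python
def temporal_decay(score: float, age_days: float,
                   half_life_days: float = 30.0,
                   evergreen: bool = False) -> float:
    """Exponential recency decay with a 30-day half-life.

    Evergreen files (MEMORY.md, entity pages) keep their raw score.
    A 30-day-old memory scores half; a 60-day-old one a quarter.
    """
    if evergreen:
        return score
    return score * 0.5 ** (age_days / half_life_days)
```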

How agents learn — the closed-loop reflection system

This is the core contribution: a self-improving loop that turns every correction into permanent fleet-wide knowledge. The loop is automatic. No human intervention required after the initial correction.

[Diagram: closed-loop reflection system. Correction triggers (human says "no"/"wrong", agent corrects another agent, build/test failure, mutation threshold of 3 failures) → immediate corrections.md entry (timestamp; what I got wrong; correct approach; source of correction; pattern/lesson; no batching, logged before the next reply, never skipped) → 3rd occurrence of the same pattern promotes to HOT tier (MEMORY.md, always loaded every session and turn, under 8k tokens, applied immediately) → sleep cycle writes to clan-learnings (patterns.jsonl, discoveries.jsonl, generated patterns.md via the render-patterns script) → CRDT merge (OR-Set for added patterns, LWW-Register for updates, G-Counter for occurrence counts) → gossip to all agents, applied at session start. Discoveries channel: entries carry id, type, agent, ts, context, problem, solution, tags, confidence; published for workarounds costing over 15min, undocumented behaviors, and performance findings; consumed tail -20 at session start; routine fixes, things already in patterns.md, and secrets are never published. Mutation protocol: 3 failures → declare M1-M8 in JOURNAL.md → strategy switch → outcome to patterns.jsonl.]

The trigger is any correction: a human saying "no", another agent issuing a correction, or a build failure. The agent logs it to corrections.md immediately, before the next reply. Format: what went wrong, correct approach, source, pattern. No batching. No delays.

At the third occurrence of the same pattern, a promotion fires: the corrected understanding moves to MEMORY.md (HOT tier, always loaded). From that point, every session the agent runs starts with that correction baked in. It cannot be forgotten.
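The promotion rule is a simple occurrence counter. A sketch; the class and attribute names are hypothetical, and the real system persists counts across sessions in files rather than in memory:

```python
from collections import Counter

class ReflectionLoop:
    """Counts repeated correction patterns; the 3rd occurrence promotes to HOT."""
    PROMOTE_AT = 3

    def __init__(self):
        self.occurrences = Counter()
        self.hot = []   # stands in for MEMORY.md entries

    def record_correction(self, pattern: str) -> bool:
        """Log one correction; return True exactly when a promotion fires."""
        self.occurrences[pattern] += 1
        if self.occurrences[pattern] == self.PROMOTE_AT and pattern not in self.hot:
            self.hot.append(pattern)   # promoted: always loaded from now on
            return True
        return False
```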

When the mutation protocol activates (3 consecutive failures on the same approach), the agent declares a named strategy change in JOURNAL.md — M1 through M8 (pivot, decompose, escalate, simplify, reduce scope, rebuild, delegate, abandon) — and continues with the new strategy. Successful mutations propagate to clan-learnings/patterns.md during the sleep cycle.
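A mutation declaration is an append to JOURNAL.md naming the strategy switch. A sketch, mapping the M1-M8 codes to the strategies listed above; the entry format is illustrative:

```python
# M1-M8 strategy labels from the mutation protocol (mapping taken from the text)
MUTATIONS = {
    "M1": "pivot", "M2": "decompose", "M3": "escalate", "M4": "simplify",
    "M5": "reduce scope", "M6": "rebuild", "M7": "delegate", "M8": "abandon",
}

def declare_mutation(journal_path, code: str, reason: str) -> str:
    """Append a named strategy-switch declaration to the worktree's JOURNAL.md."""
    if code not in MUTATIONS:
        raise ValueError(f"unknown mutation code: {code}")
    entry = f"## MUTATION {code} ({MUTATIONS[code]})\nReason: {reason}\n\n"
    with open(journal_path, "a") as f:
        f.write(entry)
    return entry
```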

How knowledge distributes — clan-learnings and CRDT gossip

An agent that learns something valuable shouldn't be the only one who knows it. The clan-learnings system propagates high-signal knowledge to every agent in the fleet, automatically, using a gossip protocol with CRDT merge semantics.

patterns.jsonl is the source of truth for cross-agent learnings — never edit it directly. Append a new entry, run render-patterns, and the generated patterns.md updates. Entries can supersede older ones ("supersedes": "old-id"). Every agent runs grep -A5 <topic> ~/d/clan-learnings/patterns.md before starting work on any unfamiliar topic.
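The render step could look roughly like this. A sketch: the id and supersedes field names come from the description above, while the pattern field and the Markdown layout are assumptions:

```python
import json
from pathlib import Path

def render_patterns(jsonl: Path, md: Path) -> int:
    """Regenerate patterns.md from patterns.jsonl, dropping superseded entries."""
    entries = [json.loads(line)
               for line in jsonl.read_text().splitlines() if line.strip()]
    # Collect every id that a newer entry claims to supersede
    superseded = {e["supersedes"] for e in entries if e.get("supersedes")}
    live = [e for e in entries if e["id"] not in superseded]
    md.write_text("\n".join(f"- {e['id']}: {e['pattern']}" for e in live) + "\n")
    return len(live)
```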

discoveries.jsonl handles non-obvious findings: undocumented API behaviors, workarounds that cost more than 15 minutes, performance findings. Published by any agent, consumed by all at session start (tail -20). Routine fixes and things already in patterns.md don't belong here.

CRDT semantics keep the distributed state consistent without coordination: OR-Set for adding patterns (concurrent adds always converge), LWW-Register for updates (last-write-wins with timestamp), G-Counter for occurrence counts (monotonically increasing, merge by max). The knowledge graph grows without conflict resolution overhead.
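Each of the three merge functions is a few lines. State-based sketches of the semantics just described; real CRDT implementations also track removals and per-replica metadata:

```python
def merge_or_set(a: set, b: set) -> set:
    """OR-Set, add-only view: concurrent adds always converge -- plain union."""
    return a | b

def merge_lww(a: tuple, b: tuple) -> tuple:
    """LWW-Register: (timestamp, value) pairs; the highest timestamp wins."""
    return max(a, b)

def merge_g_counter(a: dict, b: dict) -> dict:
    """G-Counter: per-agent counts merge element-wise by max (monotone)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}
```

All three are commutative, associative, and idempotent, which is why any pair of agents can gossip in any order and still converge to the same state.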

Deeper dives

  • Reflection Loop — the full closed-loop self-improvement cycle, correction triggers, M1-M8 mutation protocol, and fleet propagation
  • Knowledge Graph — entity extraction pipeline, 129 entities, 37 sidecars, structural traversal
  • Cognitive Journals — JOURNAL.md lifecycle, per-worktree rule, mandatory handover reads
  • Discovery Gossip & CRDT — discoveries.jsonl, patterns.jsonl, OR-Set/LWW-Register/G-Counter mechanics
  • Research Loop — PROPOSAL.md gate, blind review protocol, hypothesis-before-architecture
  • Skills Evolution — detection heuristics H1-H4, pipeline, 0.9 score threshold, 30-day monitoring
  • GraphRAG — entity extraction, relationship indexing, and structural knowledge traversal
  • Distributed Knowledge — gossip protocol, CRDT-based state sync, distributed knowledge at fleet scale
  • Security Model — how knowledge isolation works across sandboxed agents
  • Pipeline & CAE — how the reflection loop connects to the full delivery cycle