
Agent Intelligence

Agent Intelligence is how agents remember, recall, learn, and propagate understanding across an entire fleet — permanently. It is not a database you configure. It is a closed loop: agents write observations, index them automatically, search them at inference time, get corrected when they're wrong, and make those corrections available to every other agent in the clan. The loop runs continuously, without human intervention, and the fleet gets measurably smarter over time.


The seven subsystems work as a single closed loop. Each is summarized in the table below; the Deeper dives section at the end covers each in depth.

The seven subsystems at a glance

| Subsystem | What it does | Core mechanic |
| --- | --- | --- |
| Reflection Loop | Agents learn from corrections and never repeat the same mistake | corrections.md → occurrence counter → HOT promotion → M1-M8 mutations |
| Knowledge Graph | Structural memory — entities, relationships, code sidecars indexed and searchable | entity extraction → bank/entities/ → graph index → GraphRAG traversal |
| Cognitive Journals | Per-session decision log for debugging, handovers, and mutation declarations | JOURNAL.md in every worktree → archive to clan-learnings on task done |
| Gossip & CRDT | What one agent learns, all agents know — without a coordinator | discoveries.jsonl → tail-20 at session start → CRDT merge → fleet-wide patterns |
| Research Loop | Gate on architectural decisions — hypothesis before building | PROPOSAL.md → PoC → blind review → Splinter accept/reject → clan-learnings |
| Skills Evolution | Repeated manual steps get packaged into reusable skills automatically | H1-H4 detection → SKILL-PROPOSAL.md → build → Velma 0.9 score gate → adopt |
| GraphRAG | Deep structural search — traverses entity relationships, not just text similarity | entity extraction → relationship indexing → graph traversal → structural results |

Everything is built on plain Markdown files. The indexes are derived and rebuilt on demand. Lose the index? Rebuild it. Change the embedding model? Re-embed. The source of truth is always files in git — readable, editable, version-controlled.

What agents know and when — the loading sequence

Every session assembles its context in a specific order. What loads first is always loaded. What loads later is conditional. What's searched is on-demand. Understanding this sequence is the key to understanding why agents remember what they remember.

[Diagram: session start, knowledge loading sequence. (1) Workspace files: SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, HEARTBEAT.md (HOT) → (2) MEMORY.md: curated permanent facts and promoted corrections (HOT) → (3) today's daily log, memory/YYYY-MM-DD.md (WARM) → (4) yesterday's log (WARM) → (5) clan-learnings: patterns.md and discoveries tail -20 (WARM) → (6) search priming: hybrid vec+BM25 query on the task topic, MMR-ranked (on-demand) → (7) assembled agent context, work begins. Tier sizes: HOT under 8k tokens, always in context; WARM 8–32k tokens, loaded at session start; BANK unbounded, chunked, searched on demand; ARCHIVE compressed chunks, explicit recall only.]

HOT — always in context

Workspace files (SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, HEARTBEAT.md) and MEMORY.md are injected into every session, every turn. These are the agent's permanent personality, standing orders, and curated long-term facts. Anything promoted to HOT tier stays there until explicitly archived. Target size: under 8k tokens.

WARM — loaded at session start

Today's and yesterday's daily logs (memory/YYYY-MM-DD.md) are read at session start for continuity — what happened recently, what was decided, what's in flight. The tail of clan-learnings/discoveries.jsonl (last 20 entries) is also loaded, giving the agent recent cross-fleet findings before any work begins.

BANK — searched on demand

Older logs, entity pages (bank/entities/), and session transcripts live in the BANK tier. They're not loaded automatically — they're retrieved by the search pipeline when a query matches. The hybrid retrieval system (see below) handles this transparently.

ARCHIVE — rarely accessed

Compacted old sessions, superseded patterns, and archived self-improvement entries live here. Accessible by explicit recall only.
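The tier assembly at session start could be sketched as follows. This is a minimal illustration: the file layout follows the tiers described above, but the function name and exact ordering logic are assumptions.

```python
from datetime import date, timedelta
from pathlib import Path

HOT_FILES = ["SOUL.md", "AGENTS.md", "IDENTITY.md",
             "TOOLS.md", "HEARTBEAT.md", "MEMORY.md"]

def load_session_context(workspace: Path) -> str:
    """Assemble context in tier order: HOT first, then WARM; BANK is search-only."""
    parts = []
    # HOT -- always in context, every session, every turn
    for name in HOT_FILES:
        f = workspace / name
        if f.exists():
            parts.append(f.read_text())
    # WARM -- today's and yesterday's daily logs, loaded once at session start
    today = date.today()
    for d in (today, today - timedelta(days=1)):
        log = workspace / "memory" / f"{d.isoformat()}.md"
        if log.exists():
            parts.append(log.read_text())
    # WARM -- last 20 cross-fleet discoveries
    disc = workspace / "clan-learnings" / "discoveries.jsonl"
    if disc.exists():
        parts.append("\n".join(disc.read_text().splitlines()[-20:]))
    return "\n\n".join(parts)
```

BANK and ARCHIVE never appear here: they are reached only through the retrieval pipeline described in the next section.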

How agents remember — write paths

Memory is append-only. Agents never overwrite — they append, date-stamp, and let the indexer pick up changes. Six write paths feed the knowledge system:

  • Daily log append — decisions, observations, anything worth remembering go into memory/YYYY-MM-DD.md throughout the session
  • Session compaction flush — when context nears limits, a memory flush writes a structured summary before the session compacts. Nothing is lost.
  • corrections.md write — every correction, immediately, before the next reply. No batching, no delays.
  • Entity page update — stable facts about people, projects, systems curated into bank/entities/
  • Gossip broadcast receive — incoming pattern/discovery entries from other agents via CRDT merge
  • Discovery publish — non-obvious findings (workarounds > 15min, undocumented behaviors) written to clan-learnings/discoveries.jsonl
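A discovery publish is just an append of one JSON line. A sketch, using the field names listed for the discoveries channel (id, type, agent, ts, context, problem, solution, tags, confidence); the function name and exact value formats are assumptions:

```python
import json
import time
import uuid
from pathlib import Path

def publish_discovery(path: Path, agent: str, problem: str, solution: str,
                      tags: list, confidence: float) -> dict:
    """Append one discovery entry; the file is append-only, never rewritten."""
    entry = {
        "id": str(uuid.uuid4()),
        "type": "discovery",
        "agent": agent,
        "ts": int(time.time()),
        "context": "",          # free-text description of the situation
        "problem": problem,
        "solution": solution,
        "tags": tags,
        "confidence": confidence,
    }
    with path.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```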

How agents search — parallel retrieval

Search runs two parallel paths. Text retrieval (vector + BM25, or the QMD sidecar when active) and graph traversal (GraphRAG) run simultaneously. Results are merged, deduplicated, re-ranked by recency and diversity, and returned as cited context bundles — small, attributed snippets the agent can use and reference exactly.

[Diagram: knowledge retrieval, two parallel paths. A natural-language query fans out to text retrieval and graph traversal. Text path: vector search (sqlite-vec, cosine similarity, local GGUF/Ollama/API model, good for paraphrased recall) runs alongside BM25/FTS5 (SQLite token matching, under 10ms, no model, good for exact IDs, error strings, symbols); results pass through hybrid merge: 70% vec + 30% BM25 score fusion, deduplication, temporal decay with a 30-day half-life, MMR re-ranking. When the QMD sidecar is active, it replaces vector+BM25 with one Bun pipeline (BM25 + vec + reranking, auto GGUF download, offline). Graph path: GraphRAG (129 entities, 131 relationships) answers structural queries via entity walks and dependency chains. Everything merges into a context bundle: cited snippets with source paths, line ranges, and scores, plus structural facts and entity relations. Current knowledge graph: 129 entities, 131 relationships, 283 patterns.]

Text retrieval — vector + BM25 (or QMD)

Vector and BM25 run simultaneously. Results merge at 70/30 weighting by default, then pass through temporal decay and MMR re-ranking. When the QMD sidecar is active, it replaces both — bundling BM25, vector search, and reranking into one local pipeline via Bun + node-llama-cpp.
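The 70/30 fusion step can be illustrated in a few lines. A sketch, assuming both retrievers return scores normalized to [0, 1]; the function name is hypothetical:

```python
def fuse_scores(vec_hits: dict, bm25_hits: dict,
                w_vec: float = 0.7, w_bm25: float = 0.3) -> dict:
    """Weighted score fusion across both retrievers.

    A document found by only one retriever contributes 0 from the other,
    so exact-token hits and semantic hits both survive the merge.
    """
    fused = {}
    for doc_id in set(vec_hits) | set(bm25_hits):
        fused[doc_id] = (w_vec * vec_hits.get(doc_id, 0.0)
                         + w_bm25 * bm25_hits.get(doc_id, 0.0))
    return fused
```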

T1 — BM25 / FTS5 (keyword)

SQLite FTS5 token matching. Under 10ms, no ML, no model required. Best for exact identifiers: ticket numbers, env var names, error strings, code symbols. Weak on paraphrased recall.
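A minimal FTS5 sketch of this mode; the chunks schema and sample data are illustrative, not the actual index layout:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# FTS5 virtual table over memory chunks (schema is illustrative)
con.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
con.execute("INSERT INTO chunks VALUES (?, ?)",
            ("memory/2024-01-05.md", "SHOTAPP-42 failed: gateway env var unset"))
con.execute("INSERT INTO chunks VALUES (?, ?)",
            ("memory/2024-01-06.md", "refactored the retry loop"))
# Exact-token phrase match with BM25 ranking -- fast, no model required
rows = con.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ('"SHOTAPP-42"',)).fetchall()
# rows now holds the matching paths, best-ranked first
```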

T2 — Vector search (semantic)

Chunks (~400 tokens) embedded with a configurable model (local GGUF, Ollama, OpenAI, Gemini, Voyage, Mistral). Cosine similarity matching. Best for paraphrased recall — "machine running the gateway" finds "Mac Studio gateway host." Weak on exact tokens.
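Semantic matching bottoms out in cosine similarity between embedding vectors. A bare-bones sketch; real deployments use sqlite-vec's indexed distance computation rather than Python loops:

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity: 1.0 for identical direction, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```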

T3 — QMD sidecar (hybrid + reranking)

BM25 + vector search + reranking in one local pipeline. Runs via Bun + node-llama-cpp, auto-downloads GGUF models, managed as a gateway subprocess. Best of both modes with a unified interface.

T4 — GraphRAG (structural)

Entity-aware traversal over a knowledge graph with 129 entities and 131 relationships. Answers structural queries: "Who worked on the auth system?", "What depends on payments?", "Show everything related to the migration." Complements vector and keyword — finds information that is structurally related, not just textually similar.
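Structural retrieval is, at its core, a bounded walk over the relationship index. A toy sketch; the graph contents and function name are illustrative:

```python
from collections import deque

# Toy relationship index: entity -> [(relation, entity), ...] -- names are made up
GRAPH = {
    "auth-system": [("worked_on_by", "alice"), ("depends_on", "payments")],
    "payments":    [("worked_on_by", "bob")],
    "alice":       [],
    "bob":         [],
}

def related(start: str, max_hops: int = 2) -> set:
    """Breadth-first walk over the relationship index, up to max_hops away."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted for this branch
        for _relation, neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen - {start}
```

A query like "who worked on auth?" becomes `related("auth-system")` followed by filtering on the relation labels; no text similarity is involved.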

Post-processing: merge, decay, re-rank

The result sets from all active retrieval modes merge with score fusion and deduplication. Temporal decay applies an exponential recency boost (30-day half-life) — recent memories score higher. Evergreen files (MEMORY.md, entity pages) are never decayed. MMR re-ranking (Maximal Marginal Relevance) then ensures top results cover diverse aspects rather than repeating the same information.
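The decay step reduces to one formula: score × 0.5^(age / half-life). A sketch with the evergreen exemption included; the function name is assumed:

```python
def temporal_decay(score: float, age_days: float,
                   half_life_days: float = 30.0,
                   evergreen: bool = False) -> float:
    """Exponential recency decay with a 30-day half-life.

    Evergreen files (MEMORY.md, entity pages) keep their raw score.
    A 30-day-old memory scores half; a 60-day-old one a quarter.
    """
    if evergreen:
        return score
    return score * 0.5 ** (age_days / half_life_days)
```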

How agents learn — the closed-loop reflection system

This is the core contribution: a self-improving loop that turns every correction into permanent fleet-wide knowledge. The loop is automatic. No human intervention required after the initial correction.

[Diagram: closed-loop reflection system. Correction triggers (human says "no"/"wrong", agent corrects another agent, build/test failure, mutation threshold of 3 failures) → immediate corrections.md entry (timestamp; what I got wrong; correct approach; source of correction; pattern/lesson; no batching, logged before the next reply, never skipped) → 3rd occurrence of the same pattern promotes to HOT tier (MEMORY.md, always loaded every session and turn, under 8k tokens, applied immediately) → sleep cycle writes to clan-learnings (patterns.jsonl, discoveries.jsonl, generated patterns.md via the render-patterns script) → CRDT merge (OR-Set for added patterns, LWW-Register for updates, G-Counter for occurrence counts) → gossip to all agents, applied at session start. Discoveries channel: entries carry id, type, agent, ts, context, problem, solution, tags, confidence; published for workarounds costing over 15min, undocumented behaviors, and performance findings; consumed tail -20 at session start; routine fixes, things already in patterns.md, and secrets are never published. Mutation protocol: 3 failures → declare M1-M8 in JOURNAL.md → strategy switch → outcome to patterns.jsonl.]

The trigger is any correction: a human saying "no", another agent issuing a correction, or a build failure. The agent logs it to corrections.md immediately, before the next reply. Format: what went wrong, correct approach, source, pattern. No batching. No delays.

At the third occurrence of the same pattern, a promotion fires: the corrected understanding moves to MEMORY.md (HOT tier, always loaded). From that point, every session the agent runs starts with that correction baked in. It cannot be forgotten.
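The promotion rule is a simple occurrence counter. A sketch; the class and attribute names are hypothetical, and the real system persists counts across sessions in files rather than in memory:

```python
from collections import Counter

class ReflectionLoop:
    """Counts repeated correction patterns; the 3rd occurrence promotes to HOT."""
    PROMOTE_AT = 3

    def __init__(self):
        self.occurrences = Counter()
        self.hot = []   # stands in for MEMORY.md entries

    def record_correction(self, pattern: str) -> bool:
        """Log one correction; return True exactly when a promotion fires."""
        self.occurrences[pattern] += 1
        if self.occurrences[pattern] == self.PROMOTE_AT and pattern not in self.hot:
            self.hot.append(pattern)   # promoted: always loaded from now on
            return True
        return False
```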

When the mutation protocol activates (3 consecutive failures on the same approach), the agent declares a named strategy change in JOURNAL.md — M1 through M8 (pivot, decompose, escalate, simplify, reduce scope, rebuild, delegate, abandon) — and continues with the new strategy. Successful mutations propagate to clan-learnings/patterns.md during the sleep cycle.
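A mutation declaration is an append to JOURNAL.md naming the strategy switch. A sketch, mapping the M1-M8 codes to the strategies listed above; the entry format is illustrative:

```python
# M1-M8 strategy labels from the mutation protocol (mapping taken from the text)
MUTATIONS = {
    "M1": "pivot", "M2": "decompose", "M3": "escalate", "M4": "simplify",
    "M5": "reduce scope", "M6": "rebuild", "M7": "delegate", "M8": "abandon",
}

def declare_mutation(journal_path, code: str, reason: str) -> str:
    """Append a named strategy-switch declaration to the worktree's JOURNAL.md."""
    if code not in MUTATIONS:
        raise ValueError(f"unknown mutation code: {code}")
    entry = f"## MUTATION {code} ({MUTATIONS[code]})\nReason: {reason}\n\n"
    with open(journal_path, "a") as f:
        f.write(entry)
    return entry
```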

How knowledge distributes — clan-learnings and CRDT gossip

An agent that learns something valuable shouldn't be the only one who knows it. The clan-learnings system propagates high-signal knowledge to every agent in the fleet, automatically, using a gossip protocol with CRDT merge semantics.

patterns.jsonl is the source of truth for cross-agent learnings — never edit it directly. Append a new entry, run render-patterns, and the generated patterns.md updates. Entries can supersede older ones ("supersedes": "old-id"). Every agent runs grep -A5 <topic> ~/d/clan-learnings/patterns.md before starting work on any unfamiliar topic.
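The render step could look roughly like this. A sketch: the id and supersedes field names come from the description above, while the pattern field and the Markdown layout are assumptions:

```python
import json
from pathlib import Path

def render_patterns(jsonl: Path, md: Path) -> int:
    """Regenerate patterns.md from patterns.jsonl, dropping superseded entries."""
    entries = [json.loads(line)
               for line in jsonl.read_text().splitlines() if line.strip()]
    # Collect every id that a newer entry claims to supersede
    superseded = {e["supersedes"] for e in entries if e.get("supersedes")}
    live = [e for e in entries if e["id"] not in superseded]
    md.write_text("\n".join(f"- {e['id']}: {e['pattern']}" for e in live) + "\n")
    return len(live)
```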

discoveries.jsonl handles non-obvious findings: undocumented API behaviors, workarounds that cost more than 15 minutes, performance findings. Published by any agent, consumed by all at session start (tail -20). Routine fixes and things already in patterns.md don't belong here.

CRDT semantics keep the distributed state consistent without coordination: OR-Set for adding patterns (concurrent adds always converge), LWW-Register for updates (last-write-wins with timestamp), G-Counter for occurrence counts (monotonically increasing, merge by max). The knowledge graph grows without conflict resolution overhead.
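Each of the three merge functions is a few lines. State-based sketches of the semantics just described; real CRDT implementations also track removals and per-replica metadata:

```python
def merge_or_set(a: set, b: set) -> set:
    """OR-Set, add-only view: concurrent adds always converge -- plain union."""
    return a | b

def merge_lww(a: tuple, b: tuple) -> tuple:
    """LWW-Register: (timestamp, value) pairs; the highest timestamp wins."""
    return max(a, b)

def merge_g_counter(a: dict, b: dict) -> dict:
    """G-Counter: per-agent counts merge element-wise by max (monotone)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}
```

All three are commutative, associative, and idempotent, which is why any pair of agents can gossip in any order and still converge to the same state.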

Deeper dives

  • Reflection Loop — the full closed-loop self-improvement cycle, correction triggers, M1-M8 mutation protocol, and fleet propagation
  • Knowledge Graph — entity extraction pipeline, 129 entities, 37 sidecars, structural traversal
  • Cognitive Journals — JOURNAL.md lifecycle, per-worktree rule, mandatory handover reads
  • Discovery Gossip & CRDT — discoveries.jsonl, patterns.jsonl, OR-Set/LWW-Register/G-Counter mechanics
  • Research Loop — PROPOSAL.md gate, blind review protocol, hypothesis-before-architecture
  • Skills Evolution — detection heuristics H1-H4, pipeline, 0.9 score threshold, 30-day monitoring
  • GraphRAG — entity extraction, relationship indexing, and structural knowledge traversal
  • Distributed Knowledge — gossip protocol, CRDT-based state sync, distributed knowledge at fleet scale
  • Security Model — how knowledge isolation works across sandboxed agents
  • Pipeline & CAE — how the reflection loop connects to the full delivery cycle