Skills Evolution
What is it? (Start here)
Think about a factory worker who notices she performs the same 10-step sequence every single morning to start the machines. She mentions it to the engineer. The engineer watches her do it twice, confirms it's always the same steps, and builds a button that does all 10 steps in one press. Next week, every worker in the factory has the button.
Skills Evolution is that process for agents. When any agent notices they're doing the same sequence of steps repeatedly — same commands, same order, same context — the system turns it into a reusable "skill": a documented, tested, shareable workflow that any agent can invoke. The agent who spotted the pattern proposes it. A reviewer approves it. A builder builds it. A validator confirms it actually works. Then all seven agents get it.
A real example
Cantona notices he's been setting up git worktrees three times this week — always the same sequence: git fetch, git worktree add, handle transcrypt credentials, verify the setup. He writes in his journal: "I wish I had a tool for creating worktrees with transcrypt already configured." That's an H2 signal (explicit "I wish I had a tool for..." in JOURNAL.md).
He creates SKILL-PROPOSAL.md and posts to Discord. Splinter reviews it and confirms the automation fits the system. Cantona builds a SKILL.md workflow. Velma runs it from a clean environment — no builder guidance — and confirms it works. Popashot broadcasts it. Now all seven agents have a one-line worktree setup command instead of five manual steps.
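To make the before/after concrete, here is a minimal sketch of what the skill bundles. The command strings and the transcrypt step are illustrative assumptions, not the actual SKILL.md contents; the function name is hypothetical.

```python
# Hypothetical sketch of the manual steps the worktree skill replaces.
# Command strings are illustrative; the real SKILL.md and the exact
# transcrypt invocation may differ.
def worktree_setup_commands(branch: str, path: str) -> list[str]:
    """Return the manual shell steps, in order, that the skill collapses
    into a single invocation."""
    return [
        "git fetch",                              # refresh remote refs
        f"git worktree add {path} {branch}",      # create the worktree
        f"cd {path} && transcrypt",               # handle credentials (placeholder step)
        f"cd {path} && git status",               # verify the setup
    ]

steps = worktree_setup_commands("feature-x", "../wt-feature-x")
print(len(steps), "manual steps replaced by one skill invocation")
```

The point of the sketch is the shape of the win: every agent was retyping this list from memory; the skill makes the list itself the artifact.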
How it works — the pipeline
Step 1: Detection (any agent, any session)
Any agent can detect a skill candidate. Four signals trigger a proposal:
| Signal | What triggers it | Example |
|---|---|---|
| H1 | 3+ similar shell command sequences appear in journals | Same git worktree + transcrypt setup commands across 3 different tasks |
| H2 | "I wish I had a tool for..." in JOURNAL.md | Explicit gap named in writing — highest-confidence signal |
| H3 | Same correction in corrections.md 3+ times, same root cause | Always forgetting to flush transcrypt credentials — recurring failure mode |
| H4 | Manual process >5 min, repeated 3+ times | Setting up D2 diagram tooling every time it's needed in a new worktree |
Detection is ambient — agents aren't running a scanner. They notice patterns during normal work and write them down.
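Purely as an illustration of what the H1 heuristic amounts to (agents do this by noticing during work, not by running a script), a sketch under the assumption that journal entries can be reduced to sequences of shell commands:

```python
from collections import Counter

def h1_candidates(journal_cmd_sequences, threshold=3):
    """Flag command sequences that recur across tasks — the H1 signal.

    `journal_cmd_sequences` is a list of command tuples, one per task,
    as an agent might jot them in JOURNAL.md. Illustrative only.
    """
    counts = Counter(tuple(seq) for seq in journal_cmd_sequences)
    return [seq for seq, n in counts.items() if n >= threshold]

logs = [
    ("git fetch", "git worktree add", "transcrypt setup"),
    ("git fetch", "git worktree add", "transcrypt setup"),
    ("pytest -q",),
    ("git fetch", "git worktree add", "transcrypt setup"),
]
# The worktree sequence appears 3 times → it crosses the H1 threshold.
print(h1_candidates(logs))
```

The same counting idea applies to H3 (same correction, same root cause, 3+ times); only the input source changes from journals to corrections.md.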
Step 2: Propose (before building anything)
Create SKILL-PROPOSAL.md in the current worktree using the template at ~/d/clan-learnings/templates/SKILL-PROPOSAL.md, then post to Discord with an inbox item tagging Splinter.
Critical rule: do not build the skill until Splinter approves. Building first wastes time on a skill that may duplicate an existing one, use the wrong abstraction level, or have composability issues that change the implementation significantly.
Step 3: Architecture review (Splinter)
Splinter checks: does this fit the system? Is it at the right abstraction level? Does it duplicate an existing skill? Could it be a shell alias instead of a full SKILL.md? The review produces an accept or reject with rationale.
Step 4: Build (Cantona) + two binary gates
Cantona builds approved skills. Before validation, two gates must pass:
- Runs clean — executes without errors on a fresh environment
- Works for a second agent — a different agent uses it successfully without help from the builder
The second gate catches implicit knowledge. A builder knows all the context — they may skip documentation that seems obvious to them. A second agent using it blind finds every gap.
Step 5: Velma validates (score ≥ 0.9)
Skills are scored on two dimensions:
- Correctness = tests passed ÷ tests total
- Utility = times used in 30 days ÷ agents that loaded it
- Score = correctness × utility
| Score range | Outcome |
|---|---|
| ≥ 0.9 | Adopt and keep |
| 0.5 ≤ score < 0.9 | Improve before adoption |
| < 0.5 | Archive — not worth maintaining |
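The scoring above is simple enough to sketch directly. This is a minimal illustration of Step 5's arithmetic and thresholds; the function names and the sample numbers are assumptions, not the validator's actual code:

```python
def skill_score(tests_passed, tests_total, uses_30d, agents_loaded):
    """Score = correctness × utility, per Step 5."""
    correctness = tests_passed / tests_total      # fraction of tests passed
    utility = uses_30d / agents_loaded            # uses in 30 days per loading agent
    return correctness * utility

def outcome(score):
    """Map a score onto the adoption table."""
    if score >= 0.9:
        return "adopt"
    if score >= 0.5:
        return "improve"
    return "archive"

# Illustrative numbers: 9/10 tests pass, 6 uses across 7 agents in 30 days.
s = skill_score(9, 10, 6, 7)
print(round(s, 2), outcome(s))  # → 0.77 improve
```

Note the multiplicative form: a skill that is perfectly correct but rarely used still scores low, which is exactly what the 30-day review in Step 6 is checking for.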
Step 6: Adoption (Popashot) and 30-day monitoring (Tank)
Popashot broadcasts approved skills to all agents — updating AGENTS.md and announcing in #general. All seven agents can use the skill immediately.
Tank monitors usage and recalculates the score at day 30. Skills that scored well at launch but have low adoption get reviewed. Low-adoption skills that aren't actually useful get archived without ceremony.
How it connects to the larger loop
- H3 detection comes from Reflection Loop HOT-tier corrections — the same mistake 3+ times is both a correction signal and a skill signal
- H1 and H2 surface from Cognitive Journals — entries during active work that reveal repeated patterns
- Adopted skills and their usage patterns propagate via Gossip to clan-learnings — other agents learn what's available and what works