Chapter 6: Lessons Learned — Build Memory Into Your System

Something goes wrong. Maybe Claude took an action you didn’t approve. Maybe a deployment broke because of a step that wasn’t in the runbook. Maybe a decision that seemed obvious at the time turned out badly.

The default response is to fix it and move on. The lessons-learned skill is what you run instead. It guides Claude through a structured retrospective — not to assign blame, but to identify what system change would prevent this from happening again — and then implements that change before the session ends.

The Skill File

---
name: lessons-learned
description: Conduct structured retrospective analysis and encode fixes into the system. Use after incidents, mistakes, rollbacks, post-mortems, or when asking "what went wrong."
user-invocable: true
---

There’s no tools specification here because what the skill uses depends entirely on what the fix requires. Encoding a lesson might mean creating a new skill file, updating a CLAUDE.md, adding a checklist to an existing workflow, or writing a new guard. The skill guides the process; the output varies.

The Seven Phases

Phase 1: Incident definition. Before analysis, capture the facts. The skill provides a template:

## Incident Summary

**What happened:** [Factual description]
**When:** [Date/time]
**Impact:** [What was affected, scope of damage]
**Resolution:** [How it was fixed]
**Time to resolution:** [How long to fix]

The instruction is explicit: facts first, analysis later. If the facts are wrong when the analysis starts, you end up analysing the wrong problem.
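If you want to enforce "facts first" mechanically, one option is to treat the template above as a record that must be fully filled in before analysis starts. This is a minimal sketch, not part of the skill itself: the class name `IncidentSummary` and the `missing_fields` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class IncidentSummary:
    """Facts only; analysis belongs to the later phases."""
    what_happened: str
    when: str
    impact: str
    resolution: str
    time_to_resolution: str

    def missing_fields(self) -> list[str]:
        """Return the names of any fields that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def to_markdown(self) -> str:
        """Render the summary in the same layout as the template."""
        return "\n".join([
            "## Incident Summary",
            "",
            f"**What happened:** {self.what_happened}",
            f"**When:** {self.when}",
            f"**Impact:** {self.impact}",
            f"**Resolution:** {self.resolution}",
            f"**Time to resolution:** {self.time_to_resolution}",
        ])
```

A retrospective script could refuse to move on to Phase 2 while `missing_fields()` returns anything.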

Phase 2: Timeline reconstruction. Build a chronological sequence of what happened, who did what, and what resulted. Three key questions: What was the trigger? Where did the sequence diverge from what was expected? What was the point of no return?

Phase 3: Root cause analysis. The skill uses the 5 Whys technique — asking why five times to get from symptom to root cause. Not “the deployment broke” but “the deployment broke because the runbook was missing a step because we never documented that dependency because…”
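The chain of "because" answers can be written down as a simple list, which makes it obvious when the questioning stopped too early. A rough sketch, assuming you record the answers as plain strings; the function name `five_whys` is hypothetical and the skill itself runs this as a conversation, not as code.

```python
def five_whys(symptom: str, answers: list[str]) -> str:
    """Chain a symptom through successive 'why?' answers.

    The last answer is treated as the root cause candidate, so the
    chain is only as deep as the answers supplied.
    """
    if not answers:
        raise ValueError("at least one 'why' answer is needed")
    lines = [f"Symptom: {symptom}"]
    for depth, answer in enumerate(answers, start=1):
        lines.append(f"Why #{depth}: because {answer}")
    lines.append(f"Root cause candidate: {answers[-1]}")
    return "\n".join(lines)


# The first two links of the example above; a real chain keeps going.
print(five_whys(
    "the deployment broke",
    ["the runbook was missing a step", "we never documented that dependency"],
))
```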

Phase 4: Contributing factors. The root cause alone is rarely enough. The skill categorises contributing factors: process gaps, communication failures, missing technical guards, and context assumptions that turned out to be wrong.

Phase 5: Fix classification. This is where the skill distinguishes itself from a standard post-mortem. Every finding gets classified by fix type:

| Fix type | When to use | How to encode |
| --- | --- | --- |
| Skill | Recurring workflow needs structure | Create SKILL.md in ~/.claude/skills/ |
| Guard | Action requires mandatory checkpoint | Add approval gate to relevant skill |
| Documentation | Knowledge gap caused the issue | Update CLAUDE.md or relevant docs |
| Automation | Manual step was forgotten | Create hook or script |
| Checklist | Multiple steps need verification | Add to existing skill or create new one |
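The classification table can also be read as a lookup: every finding maps to exactly one fix type, and every fix type has a known place to encode it. The sketch below transcribes the table into Python under that reading. The names `FixType`, `FIX_TABLE`, `unclassified`, and `encoding_plan` are hypothetical, and the sample findings are invented for illustration.

```python
from enum import Enum
from typing import Optional


class FixType(Enum):
    SKILL = "Skill"
    GUARD = "Guard"
    DOCUMENTATION = "Documentation"
    AUTOMATION = "Automation"
    CHECKLIST = "Checklist"


# Transcribed from the table: fix type -> (when to use, how to encode).
FIX_TABLE = {
    FixType.SKILL: ("Recurring workflow needs structure", "Create SKILL.md in ~/.claude/skills/"),
    FixType.GUARD: ("Action requires mandatory checkpoint", "Add approval gate to relevant skill"),
    FixType.DOCUMENTATION: ("Knowledge gap caused the issue", "Update CLAUDE.md or relevant docs"),
    FixType.AUTOMATION: ("Manual step was forgotten", "Create hook or script"),
    FixType.CHECKLIST: ("Multiple steps need verification", "Add to existing skill or create new one"),
}


def unclassified(findings: dict[str, Optional[FixType]]) -> list[str]:
    """Findings that still have no fix type, so cannot be implemented yet."""
    return [finding for finding, fix in findings.items() if fix is None]


def encoding_plan(findings: dict[str, Optional[FixType]]) -> list[str]:
    """One line per classified finding: what it is and where the fix goes."""
    return [
        f"{finding} -> {fix.value}: {FIX_TABLE[fix][1]}"
        for finding, fix in findings.items()
        if fix is not None
    ]
```

Keeping unclassified findings visible is the point: a finding with no fix type is one that will end up as a recommendation rather than a change.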

Phase 6: Fix implementation. The skill’s most important instruction: “Don’t just recommend fixes — implement them.” A retrospective that ends with a list of recommendations is a document that will be forgotten. The skill drives Claude to make the actual change — create the file, update the workflow, add the guard — before the session ends.

Phase 7: Verification. Define how you’ll know the fix worked: a test scenario, success criteria, and a review date.
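The three verification items fit naturally into a small record with a date you can check later. This is only a sketch of one way to hold them; `VerificationPlan` and its methods are hypothetical names, and the skill does not prescribe any particular storage format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VerificationPlan:
    test_scenario: str      # how to exercise the fix
    success_criteria: str   # what "worked" looks like
    review_date: date       # when to come back and check

    def is_defined(self) -> bool:
        """True only if both text fields were actually filled in."""
        return bool(self.test_scenario.strip() and self.success_criteria.strip())

    def is_due(self, today: date) -> bool:
        """True once the review date has arrived."""
        return today >= self.review_date
```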

The Common Patterns

The skill includes a library of recurring incident patterns with their standard fixes:

Premature action — Claude took action before explicit approval. Fix: add an approval gate to the relevant skill. The template:

## Approval Gate Template

Before [ACTION]:
1. Show user exactly what will happen
2. Ask: "Ready to [action]? (yes/no)"
3. Wait for explicit "yes" or "proceed"
4. Only then execute
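In the skill, the gate is a set of instructions that Claude follows. If you want the same checkpoint in a script or hook, the four steps translate fairly directly. The following is a minimal sketch under that assumption: `approval_gate`, its parameters, and the injectable `ask` callback are hypothetical names for illustration, not something the skill defines.

```python
from typing import Callable

# Step 3 of the template: only these replies count as approval.
APPROVALS = {"yes", "proceed"}


def approval_gate(
    action: str,
    preview: str,
    execute: Callable[[], None],
    ask: Callable[[str], str] = input,
) -> bool:
    """Run `execute` only after an explicit approval. Returns True if it ran."""
    # 1. Show the user exactly what will happen.
    print(f"About to {action}:\n{preview}")
    # 2. Ask the question from the template.
    reply = ask(f"Ready to {action}? (yes/no) ")
    # 3. Wait for an explicit "yes" or "proceed"; anything else is a no.
    if reply.strip().lower() not in APPROVALS:
        print("Not approved; nothing was executed.")
        return False
    # 4. Only then execute.
    execute()
    return True
```

Treating every reply other than the two approved words as a refusal is deliberate: an ambiguous "ok" or "y" should stop the action, not start it.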

Sequence error — steps were executed in the wrong order. Fix: encode the sequence in the skill with numbered steps and an explicit dependency chain.
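An explicit dependency chain is something you can both generate numbered steps from and check an actual run against. Here is a sketch using Python's standard `graphlib` module; the step names and the helper functions `numbered_steps` and `order_violations` are hypothetical, chosen only to illustrate the idea.

```python
from graphlib import TopologicalSorter


def numbered_steps(depends_on: dict[str, set[str]]) -> list[str]:
    """Turn a step -> prerequisites mapping into a numbered sequence."""
    order = TopologicalSorter(depends_on).static_order()
    return [f"{number}. {step}" for number, step in enumerate(order, start=1)]


def order_violations(executed: list[str], depends_on: dict[str, set[str]]) -> list[str]:
    """List every step that ran before one of its prerequisites had run."""
    position = {step: index for index, step in enumerate(executed)}
    problems = []
    for step, prerequisites in depends_on.items():
        if step not in position:
            continue  # the step never ran, so it cannot be out of order
        for prerequisite in sorted(prerequisites):
            if prerequisite not in position or position[prerequisite] > position[step]:
                problems.append(f"{step!r} ran without {prerequisite!r} completing first")
    return problems


# Hypothetical deployment chain: each step lists what must come before it.
deps = {"build": {"run tests"}, "deploy": {"build"}}
print("\n".join(numbered_steps(deps)))
```

During timeline reconstruction, `order_violations` gives a quick way to see where the executed sequence diverged from the intended one.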

Missing validation — bad data passed through without a check. Fix: add a validation step to the skill or create a pre-flight check.

Context carryover — assumptions from a prior session caused the issue. Fix: add explicit context verification at task start.

Scope creep — Claude did more than requested and caused side effects. Fix: add clarifying questions before expanding scope.

The Anti-Patterns Section

The skill explicitly calls out what not to do:

| Anti-pattern | Problem | Instead |
| --- | --- | --- |
| Blame assignment | Creates defensiveness, misses systemic issues | Focus on process, not people |
| Single-cause thinking | Oversimplifies, misses contributing factors | Use 5 Whys, identify multiple factors |
| Recommendation without action | Lessons forgotten, issue recurs | Implement fixes during retrospective |
| Vague fixes | “Be more careful” doesn’t prevent recurrence | Encode specific, verifiable changes |
| Skip verification | No way to know if fix worked | Define success criteria and review date |

“Be more careful” is not a fix. A guard in the skill file is a fix. The distinction between the two is what separates a retrospective that produces change from one that produces a document.

The Feedback Loop

After the retrospective, the skill instructs Claude to log findings to the daily note via /log-to-daily. If a skill was created or modified, the skill itself is the durable artifact. For high-severity incidents, the fix gets added to an incident log.

The pattern is: incident → retrospective → encoded fix → future incident doesn’t happen. The skill encodes institutional memory directly into the system rather than into a document that no one reads.

How to Customise It

The fix classification table — my workflow has specific places for each fix type. Update the locations to match your own skill directory, CLAUDE.md paths, and hook locations.

Common incident patterns — add patterns specific to your workflow. If you have a recurring issue with a particular tool or process, encode its standard fix in the skill’s pattern library.

The verification criteria — if you want to be more or less rigorous about what counts as a verified fix, adjust the Phase 7 requirements. For critical incidents, you might want a mandatory review date and a test scenario. For minor ones, the fix itself might be sufficient.

Integration with your notes system — the skill currently integrates with Obsidian for the daily note log and the captures file. If you use a different note-taking system, update the log step.

Installing It

mkdir -p ~/.claude/skills/lessons-learned
# Copy the SKILL.md from github.com/aplaceforallmystuff

Invoke with /lessons-learned after any incident, mistake, or near-miss worth analysing.


That’s all five skills. They’re all published at github.com/aplaceforallmystuff — free to copy, modify, and make your own. The whole point is that your version should diverge from mine as you tune each one to your actual workflow.

If you want to go deeper on building skills from scratch — rather than adapting existing ones — the Building AI Agents course covers the full process: from first skill to multi-agent pipelines.

Check Your Understanding

1. What makes lessons-learned different from a standard post-mortem?

2. Why is 'Be more careful' not a fix?

3. What is the purpose of the 5 Whys technique?