132 lines
4.8 KiB
Markdown

---
name: systematic-debugging
description: Four-phase debugging methodology. Always find root cause before attempting fixes.
source: obra/superpowers (pinned 2026-03-19)
---
# Systematic Debugging
## Overview
Random fixes waste time and create new bugs. Quick patches mask underlying issues.
**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
## The Iron Law
```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```
If you haven't completed Phase 1, you cannot propose fixes.
## When to Use
Use for ANY technical issue:
- Test failures
- Bugs in production
- Unexpected behavior
- Performance problems
- Build failures
- Integration issues
**Use this ESPECIALLY when:**
- Under time pressure (emergencies make guessing tempting)
- "Just one quick fix" seems obvious
- You've already tried multiple fixes
- Previous fix didn't work
- You don't fully understand the issue
## The Four Phases
You MUST complete each phase before proceeding to the next.
### Phase 1: Root Cause Investigation
**BEFORE attempting ANY fix:**
1. **Read Error Messages Carefully**
- Don't skip past errors or warnings
- They often contain the exact solution
- Read stack traces completely
- Note line numbers, file paths, error codes
2. **Reproduce Consistently**
- Can you trigger it reliably?
- What are the exact steps?
- Does it happen every time?
- If not reproducible → gather more data, don't guess
3. **Check Recent Changes**
- What changed that could cause this?
- Git diff, recent commits
- New dependencies, config changes
- Environmental differences
4. **Gather Evidence in Multi-Component Systems**
For EACH component boundary:
- Log what data enters component
- Log what data exits component
- Verify environment/config propagation
- Check state at each layer
Run once to gather evidence showing WHERE it breaks.
5. **Trace Data Flow**
- Where does bad value originate?
- What called this with bad value?
- Keep tracing up until you find the source
- Fix at source, not at symptom
### Phase 2: Pattern Analysis
**Find the pattern before fixing:**
1. **Find Working Examples** — Locate similar working code in same codebase
2. **Compare Against References** — Read reference implementation COMPLETELY
3. **Identify Differences** — List every difference, however small
4. **Understand Dependencies** — Settings, config, environment, assumptions
### Phase 3: Hypothesis and Testing
**Scientific method:**
1. **Form Single Hypothesis** — "I think X is the root cause because Y"
2. **Test Minimally** — Make the SMALLEST possible change to test
3. **Verify Before Continuing** — Did it work? Yes → Phase 4. No → NEW hypothesis
4. **When You Don't Know** — Say "I don't understand X". Don't pretend.
### Phase 4: Implementation
**Fix the root cause, not the symptom:**
1. **Create Failing Test Case** — Simplest possible reproduction
2. **Implement Single Fix** — ONE change at a time, no "while I'm here" improvements
3. **Verify Fix** — Test passes? No other tests broken? Issue resolved?
4. **If Fix Doesn't Work** — If < 3 attempts: return to Phase 1. If 3: STOP and question the architecture
5. **If 3+ Fixes Failed: Question Architecture** Is this pattern fundamentally sound? Discuss with human before more fixes.
## Red Flags - STOP and Follow Process
- "Quick fix for now, investigate later"
- "Just try changing X and see if it works"
- "Add multiple changes, run tests"
- "It's probably X, let me fix that"
- "I don't fully understand but this might work"
- Each fix reveals new problem in different place
**ALL of these mean: STOP. Return to Phase 1.**
## Common Rationalizations
| Excuse | Reality |
|--------|---------|
| "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs. |
| "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
| "Just try this first" | First fix sets the pattern. Do it right from the start. |
| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
| "I see the problem, let me fix it" | Seeing symptoms understanding root cause. |
## Quick Reference
| Phase | Key Activities | Success Criteria |
|-------|---------------|------------------|
| **1. Root Cause** | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY |
| **2. Pattern** | Find working examples, compare | Identify differences |
| **3. Hypothesis** | Form theory, test minimally | Confirmed or new hypothesis |
| **4. Implementation** | Create test, fix, verify | Bug resolved, tests pass |
## Real-World Impact
- Systematic approach: 15-30 minutes to fix
- Random fixes approach: 2-3 hours of thrashing
- First-time fix rate: 95% vs 40%
- New bugs introduced: Near zero vs common