Files
english/.opencode/skills/ck-debug/references/systematic-debugging.md
2026-04-12 01:06:31 +07:00

103 lines
4.1 KiB
Markdown

# Systematic Debugging
Four-phase debugging framework that ensures root cause investigation before attempting fixes.
## The Iron Law
```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```
If haven't completed Phase 1, cannot propose fixes.
## The Four Phases
Must complete each phase before proceeding to next.
### Phase 1: Root Cause Investigation
**BEFORE attempting ANY fix:**
1. **Read Error Messages Carefully** - Don't skip past errors/warnings, read stack traces completely
2. **Reproduce Consistently** - Can trigger reliably? Exact steps? If not reproducible → gather more data
3. **Check Recent Changes** - What changed? Git diff, recent commits, new dependencies, config changes
4. **Gather Evidence in Multi-Component Systems**
- For EACH component boundary: log data entering/exiting, verify environment propagation
- Run once to gather evidence showing WHERE it breaks
- THEN analyze to identify failing component
5. **Trace Data Flow** - Where does bad value originate? Trace up call stack until finding source (see root-cause-tracing.md)
### Phase 2: Pattern Analysis
**Find pattern before fixing:**
1. **Find Working Examples** - Locate similar working code in same codebase
2. **Compare Against References** - Read reference implementation COMPLETELY, understand fully before applying
3. **Identify Differences** - List every difference however small, don't assume "that can't matter"
4. **Understand Dependencies** - What other components, settings, config, environment needed?
### Phase 3: Hypothesis and Testing
**Scientific method:**
1. **Form Single Hypothesis** - "I think X is root cause because Y", be specific not vague
2. **Test Minimally** - SMALLEST possible change to test hypothesis, one variable at a time
3. **Verify Before Continuing** - Worked? → Phase 4. Didn't work? → NEW hypothesis. DON'T add more fixes
4. **When Don't Know** - Say "I don't understand X", don't pretend, ask for help
### Phase 4: Implementation
**Fix root cause, not symptom:**
1. **Create Failing Test Case** - Simplest reproduction, automated if possible, MUST have before fixing
2. **Implement Single Fix** - Address root cause identified, ONE change, no "while I'm here" improvements
3. **Verify Fix** - Test passes? No other tests broken? Issue actually resolved?
4. **If Fix Doesn't Work**
- STOP. Count: How many fixes tried?
- If < 3: Return to Phase 1, re-analyze with new information
- **If 3: STOP and question architecture**
5. **If 3+ Fixes Failed: Question Architecture**
- Pattern: Each fix reveals new shared state/coupling problem elsewhere
- STOP and question fundamentals: Is pattern sound? Wrong architecture?
- Discuss with human partner before more fixes
## Red Flags - STOP and Follow Process
If catch yourself thinking:
- "Quick fix for now, investigate later"
- "Just try changing X and see if it works"
- "Add multiple changes, run tests"
- "Skip the test, I'll manually verify"
- "It's probably X, let me fix that"
- "I don't fully understand but this might work"
- "One more fix attempt" (when already tried 2+)
**ALL mean:** STOP. Return to Phase 1.
## Human Partner Signals You're Doing It Wrong
- "Is that not happening?" - Assumed without verifying
- "Will it show us...?" - Should have added evidence gathering
- "Stop guessing" - Proposing fixes without understanding
- "Ultrathink this" - Question fundamentals, not just symptoms
- "We're stuck?" (frustrated) - Approach isn't working
**When see these:** STOP. Return to Phase 1.
## Common Rationalizations
| Excuse | Reality |
|--------|---------|
| "Issue is simple, don't need process" | Simple issues have root causes too |
| "Emergency, no time for process" | Systematic is FASTER than guess-and-check |
| "Just try this first, then investigate" | First fix sets pattern. Do right from start |
| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem |
## Real-World Impact
From debugging sessions:
- Systematic approach: 15-30 minutes to fix
- Random fixes approach: 2-3 hours of thrashing
- First-time fix rate: 95% vs 40%
- New bugs introduced: Near zero vs common