# Root Cause Tracing Systematically trace bugs backward through call stack to find original trigger. ## Core Principle **Trace backward through call chain until finding original trigger, then fix at source.** Bugs often manifest deep in call stack (git init in wrong directory, file created in wrong location). Instinct is to fix where error appears, but that's treating symptom. ## When to Use **Use when:** - Error happens deep in execution (not at entry point) - Stack trace shows long call chain - Unclear where invalid data originated - Need to find which test/code triggers problem ## The Tracing Process ### 1. Observe the Symptom ``` Error: git init failed in /Users/jesse/project/packages/core ``` ### 2. Find Immediate Cause What code directly causes this? ```typescript await execFileAsync('git', ['init'], { cwd: projectDir }); ``` ### 3. Ask: What Called This? ```typescript WorktreeManager.createSessionWorktree(projectDir, sessionId) → called by Session.initializeWorkspace() → called by Session.create() → called by test at Project.create() ``` ### 4. Keep Tracing Up What value was passed? - `projectDir = ''` (empty string!) - Empty string as `cwd` resolves to `process.cwd()` - That's the source code directory! ### 5. Find Original Trigger Where did empty string come from? ```typescript const context = setupCoreTest(); // Returns { tempDir: '' } Project.create('name', context.tempDir); // Accessed before beforeEach! ``` ## Adding Stack Traces When can't trace manually, add instrumentation: ```typescript async function gitInit(directory: string) { const stack = new Error().stack; console.error('DEBUG git init:', { directory, cwd: process.cwd(), stack, }); await execFileAsync('git', ['init'], { cwd: directory }); } ``` **Critical:** Use `console.error()` in tests (not logger - may not show) **Run and capture:** ```bash npm test 2>&1 | grep 'DEBUG git init' ``` **Analyze stack traces:** - Look for test file names - Find line number triggering call - Identify pattern (same test? same parameter?) ## Finding Which Test Causes Pollution If something appears during tests but don't know which test: Use bisection script: `scripts/find-polluter.sh` ```bash ./scripts/find-polluter.sh '.git' 'src/**/*.test.ts' ``` Runs tests one-by-one, stops at first polluter. ## Key Principle **NEVER fix just where error appears.** Trace back to find original trigger. When found immediate cause: - Can trace one level up? → Trace backwards - Is this the source? → Fix at source - Then add validation at each layer (see defense-in-depth.md) ## Real Example **Symptom:** `.git` created in `packages/core/` (source code) **Trace chain:** 1. `git init` runs in `process.cwd()` ← empty cwd parameter 2. WorktreeManager called with empty projectDir 3. Session.create() passed empty string 4. Test accessed `context.tempDir` before beforeEach 5. setupCoreTest() returns `{ tempDir: '' }` initially **Root cause:** Top-level variable initialization accessing empty value **Fix:** Made tempDir a getter that throws if accessed before beforeEach **Also added defense-in-depth:** - Layer 1: Project.create() validates directory - Layer 2: WorkspaceManager validates not empty - Layer 3: NODE_ENV guard refuses git init outside tmpdir - Layer 4: Stack trace logging before git init