init

2026-04-12 01:06:31 +07:00
commit 10d660cbcb
1066 changed files with 228596 additions and 0 deletions
--- a/.opencode/skills/code-review/references/adversarial-review.md
+++ b/.opencode/skills/code-review/references/adversarial-review.md
@@ -0,0 +1,223 @@
+---
+name: adversarial-review
+description: Stage 3 red-team review that actively tries to break code — finds security holes, false assumptions, failure modes, race conditions. Spawns adversarial reviewer subagent with destructive mindset. Includes scope gate for trivial changes.
+---
+
+# Adversarial Review (Stage 3)
+
+Runs after every Stage 2 (Code Quality) pass. Subject to scope gate below.
+
+## Scope Gate
+
+Skip adversarial review when ALL of these are true:
+- Changed files <= 2
+- Lines changed <= 30
+- No security-sensitive files touched (auth, crypto, input parsing, SQL, env)
+- No new dependencies added
+
+When skipped, note: `Adversarial: skipped (below threshold)` in review output.
+
+**NEVER skip when:**
+- Any file in: `auth/`, `middleware/`, `security/`, `crypto/`
+- `package.json`, `package-lock.json`, or lockfile changed
+- Environment variables added/changed
+- Database schema modified
+- API route added/changed
+
+## Mindset
+
+> "You are hired to tear apart the implementer's work. Your job is to find every way this code can fail, be exploited, or produce incorrect results. Assume the implementer made mistakes. Prove it."
+
+This is NOT a standard code review. Standard reviews check if code meets requirements. Adversarial review assumes requirements are met and asks: **"How can this still break?"**
+
+## What to Attack
+
+### Security Holes
+- Injection vectors (SQL, command, XSS, template)
+- Auth bypass paths (missing checks, privilege escalation)
+- Secrets exposure (logs, error messages, stack traces)
+- Input trust boundaries (user input treated as safe)
+- SSRF, path traversal, deserialization attacks
+
+### False Assumptions
+- "This will never be null" -- prove it can be
+- "This list always has elements" -- find the empty case
+- "Users always call A before B" -- find the out-of-order path
+- "This config value exists" -- find the missing env var
+- "This third-party API always returns 200" -- find the failure mode
+- "This API shape won't change" -- find the breaking caller
+
+### Failure Modes & Resource Exhaustion
+- What happens when disk is full?
+- What happens when network times out mid-operation?
+- What happens when the database connection drops during a transaction?
+- Unbounded allocations from user-controlled input
+- Missing timeouts on external calls
+- Event loop blocking (sync operations in async context)
+- Connection/handle leaks on error paths
+- Regex catastrophic backtracking (ReDoS)
+
+### Race Conditions
+- Shared mutable state without locks
+- Time-of-check-to-time-of-use (TOCTOU)
+- Async operations with implicit ordering assumptions
+- Cache invalidation during concurrent writes
+
+### Data Corruption
+- Partial writes on failure (no transaction/rollback)
+- Type coercion surprises (string "0" as falsy)
+- Floating point comparison for equality
+- Timezone-naive datetime operations
+
+### Supply Chain & Dependencies
+- New dependencies: postinstall scripts, maintainer reputation, bundle size
+- Lockfile changes: version drift, removed integrity hashes
+- Transitive deps pulling in known-vulnerable packages
+
+### Observability Blind Spots
+- Swallowed errors (`catch {}` with no log)
+- Missing structured context in error logs
+- PII in log output
+
+## Process
+
+### 1. Spawn Adversarial Reviewer
+
+Dispatch `code-reviewer` subagent with adversarial prompt:
+
+```
+You are an adversarial code reviewer. Your ONLY job is to find ways this code
+can fail, be exploited, or produce incorrect results.
+
+DO NOT praise the code. DO NOT note what works well.
+ONLY report problems. If you find nothing, say "No findings" -- but try harder first.
+
+Focus on ADDED/MODIFIED lines (+ prefix in diff). Pre-existing code is out of scope
+unless the change makes it newly exploitable.
+
+Context (read for understanding, DO NOT review):
+{CONTEXT_FILES}
+
+Runtime: {RUNTIME} (e.g., Node.js single-threaded, browser, serverless)
+Framework: {FRAMEWORK} (e.g., Express with global error handler at app.ts:45)
+
+Review this diff:
+{DIFF}
+
+Changed files: {FILES}
+
+Attack vectors to check:
+1. Security holes (injection, auth bypass, secrets exposure)
+2. False assumptions (null, empty, ordering, config, API contracts)
+3. Failure modes + resource exhaustion (timeouts, leaks, unbounded input)
+4. Race conditions (shared state, TOCTOU, async ordering)
+5. Data corruption (partial writes, type coercion, encoding)
+6. Supply chain (new deps, lockfile changes, transitive vulns)
+7. Observability (swallowed errors, missing logs, PII in output)
+
+For each finding, report:
+- SEVERITY: Critical / Medium / Low
+- CATEGORY: Security / Assumption / Failure / Race / Data / Supply / Observability
+- LOCATION: file:line
+- ATTACK: How to trigger the problem
+- IMPACT: What happens when triggered
+- FIX: Describe the fix approach (e.g., "add null check before line 42").
+  Do NOT write implementation code -- the implementer has full context.
+```
+
+**If adversarial produces >10 findings on <100 lines changed:** likely too aggressive. Batch-reject noise, deep-review only Critical/Medium.
+
+### 2. Adjudicate Findings
+
+Main agent reviews each adversarial finding and assigns verdict:
+
+| Verdict | Meaning | Action |
+|---------|---------|--------|
+| **Accept** | Valid flaw, reproducible or clearly reasoned | Must fix before merge |
+| **Reject** | False positive, already handled, or impossible path | Document why, no action |
+| **Defer** | Valid but low-risk, tracked for later | Create GitHub issue for tracking |
+
+**Rules:**
+- Every finding gets a verdict -- no silent dismissals
+- Critical findings: Accept unless you can PROVE false positive
+- Benefit of doubt goes to the adversary (safer to fix than to dismiss)
+- If >50% of findings are Rejected, the adversary was too aggressive -- but still report all
+
+**Calibration examples:**
+
+| Verdict | Example | Reasoning |
+|---------|---------|-----------|
+| Accept | "SQL injection via string interpolation in query builder" | Clearly exploitable, concrete path shown |
+| Reject | "Missing null check on config.apiUrl" | Config loaded at startup with schema validation (see config.ts:12), cannot be null at runtime |
+| Defer | "No rate limiting on POST /api/upload" | Valid concern but internal-only tool currently; track for public exposure |
+
+### 3. Report Format
+
+```
+## Adversarial Review -- Stage 3
+
+### Summary
+- Findings: N total (X Critical, Y Medium, Z Low)
+- Accepted: A (must fix)
+- Rejected: B (false positive)
+- Deferred: C (tracked via GitHub issues)
+
+### Accepted Findings (Must Fix)
+
+#### [1] SEVERITY -- CATEGORY -- file:line
+**Attack:** How to trigger
+**Impact:** What happens
+**Fix:** Approach description
+**Verdict:** Accept -- [reason]
+
+### Rejected Findings
+
+#### [N] SEVERITY -- CATEGORY -- file:line
+**Attack:** Claimed vector
+**Verdict:** Reject -- [reason this is a false positive]
+
+### Deferred Findings
+
+#### [N] SEVERITY -- CATEGORY -- file:line
+**Attack:** How to trigger
+**Verdict:** Defer -- [reason] → GitHub issue #X
+```
+
+### 4. Fix Accepted Findings
+
+- Critical: Block merge. Fix immediately via `/fix` or manual edit.
+- Medium: Fix before merge if feasible. Defer only with explicit user approval.
+- Low: Track. Fix in follow-up if pattern repeats.
+
+### Re-review Optimization
+
+On fix cycles (re-running after accepted findings were fixed):
+- Only pass the FIX diff to adversarial, not the full original diff
+- Verify accepted findings are resolved
+- Check for regression: did the fix introduce new issues?
+
+## Integration with Pipeline
+
+```
+Stage 1 (Spec) → PASS
+  ↓
+Stage 2 (Quality) → PASS
+  ↓
+Scope gate → below threshold? → skip (note in report)
+  ↓ (above threshold)
+Stage 3 (Adversarial) → findings
+  ├─ 0 Accepted → PASS → proceed
+  ├─ Accepted Critical → BLOCK → fix → re-run Stage 3 (fix diff only)
+  └─ Accepted Medium/Low only → fix or defer → proceed
+```
+
+**Task pipeline update:** When using task-managed reviews, adversarial review gets its own task between "Review implementation" and "Fix critical issues".
+
+## What This Is NOT
+
+- NOT a style review (Stage 2 handles that)
+- NOT a spec compliance check (Stage 1 handles that)
+- NOT dependency graph analysis or import tracing (scout handles that)
+- NOT a general "suggestions for improvement" pass
+
+This is a focused, hostile attempt to break the code. If the code survives, it's ready to ship.
--- a/.opencode/skills/code-review/references/checklist-workflow.md
+++ b/.opencode/skills/code-review/references/checklist-workflow.md
@@ -0,0 +1,100 @@
+# Checklist-Based Review Workflow
+
+How to apply structured review checklists during code review.
+
+## When to Use
+
+- Pre-landing review (from `/ck:ship` pipeline)
+- Explicit request for checklist review
+- Security audit before release
+- Code-reviewer agent when reviewing significant changes (10+ files or security-sensitive)
+
+## Workflow
+
+### 1. Auto-Detect Project Type
+
+```bash
+# Check for web app frameworks
+if grep -qE '"(react|vue|svelte|next|nuxt|angular)"' package.json 2>/dev/null; then
+  echo "web-app"
+# Check for API patterns
+elif ls src/routes/ src/api/ src/controllers/ app/controllers/ 2>/dev/null | head -1; then
+  echo "api"
+else
+  echo "base-only"
+fi
+```
+
+### 2. Load Checklists
+
+Always load: `checklists/base.md`
+
+Overlay based on detection:
+- `web-app` → also load `checklists/web-app.md`
+- `api` → also load `checklists/api.md`
+- Both detected → load both overlays
+
+### 3. Get the Diff
+
+```bash
+git fetch origin main --quiet
+git diff origin/main
+```
+
+**CRITICAL:** Read the FULL diff before flagging anything. Checklist suppressions require full context.
+
+### 4. Two-Pass Review
+
+**Pass 1 (CRITICAL) — Run first:**
+- Scan diff against ALL critical categories (base + overlays)
+- Each finding must include: `[file:line]`, problem, fix
+- These block `/ship` pipeline
+
+**Pass 2 (INFORMATIONAL) — Run second:**
+- Scan diff against ALL informational categories (base + overlays)
+- Same format: `[file:line]`, problem, fix
+- Included in PR body but don't block
+
+### 5. Check Suppressions
+
+Before reporting any finding, verify it's NOT in the suppressions list (bottom of `base.md`).
+
+Key suppressions:
+- Already addressed in the diff
+- Readability-aiding redundancy
+- Style/formatting issues
+- "Consider using X" when Y works fine
+
+### 6. Output
+
+```
+Pre-Landing Review: N issues (X critical, Y informational)
+
+**CRITICAL** (blocking):
+- [src/auth/login.ts:42] SQL injection via string interpolation in user lookup
+  Fix: Use parameterized query: `db.query('SELECT * FROM users WHERE email = $1', [email])`
+
+**Issues** (non-blocking):
+- [src/api/users.ts:88] Magic number 30 for pagination limit
+  Fix: Extract to constant `DEFAULT_PAGE_SIZE = 30`
+```
+
+### 7. Critical Issue Resolution
+
+For each critical issue, use `AskUserQuestion`:
+- Problem with `file:line`
+- Recommended fix
+- Options:
+  - A) Fix now (recommended)
+  - B) Acknowledge and proceed
+  - C) False positive — skip
+
+If user chose A (fix): apply fixes, commit, then re-run tests before continuing.
+
+## Integration with /ck:ship
+
+The ship pipeline calls this workflow at Step 4. Critical findings block the pipeline. Informational findings are included in the PR body.
+
+## Integration with /ck:code-review
+
+When invoked as part of standard code review, the checklist augments (not replaces) the existing scout → review → fix → verify pipeline. Checklist findings are merged with code-reviewer's own findings.
--- a/.opencode/skills/code-review/references/checklists/api.md
+++ b/.opencode/skills/code-review/references/checklists/api.md
@@ -0,0 +1,52 @@
+# API Review Checklist (Overlay)
+
+Additive to `base.md`. Apply when project exposes REST/GraphQL/gRPC APIs.
+
+## Detection
+
+Apply this overlay when any of these are true:
+- Project has route definitions (Express, FastAPI, NestJS, Django, Rails, Go chi/gin)
+- OpenAPI/Swagger spec file exists
+- `src/routes/`, `src/api/`, `src/controllers/` directories
+- GraphQL schema files in the diff
+
+---
+
+## Pass 1 — CRITICAL (additions to base)
+
+### Auth & Rate Limiting
+- Public endpoints missing rate limiting (login, registration, password reset)
+- API keys or tokens exposed in URL query parameters (use headers)
+- Missing auth middleware on new routes
+- Batch/bulk endpoints without per-item authorization checks
+
+### Input Validation
+- Request body accepted without schema validation (missing Zod, Joi, Pydantic, etc.)
+- Mass assignment: entire request body spread into database model
+- File upload without size/type restrictions
+- Array inputs without length limits (DoS via large payloads)
+
+### Data Exposure
+- Sensitive fields in API responses (password hashes, internal IDs, tokens)
+- Stack traces or internal error details in production error responses
+- Verbose error messages that leak schema/implementation details
+
+---
+
+## Pass 2 — INFORMATIONAL (additions to base)
+
+### API Design
+- List endpoints without pagination (LIMIT/OFFSET or cursor-based)
+- Missing consistent error response format across endpoints
+- Inconsistent naming conventions (camelCase vs snake_case in same API)
+- Missing request/response content-type headers
+
+### Observability
+- New endpoints without logging/metrics
+- Error paths that swallow exceptions silently
+- Missing correlation/request IDs for tracing
+
+### Versioning & Compatibility
+- Breaking changes to existing response shapes without version bump
+- Removed fields without deprecation notice
+- Changed field types (string → number) in existing responses
--- a/.opencode/skills/code-review/references/checklists/base.md
+++ b/.opencode/skills/code-review/references/checklists/base.md
@@ -0,0 +1,100 @@
+# Base Review Checklist
+
+Universal checklist for all project types. Two-pass model: critical (blocking) + informational (non-blocking).
+
+## Instructions
+
+Review `git diff origin/main` for the issues below. Be specific — cite `file:line` and suggest fixes. Skip anything that's fine. Only flag real problems.
+
+**Output format:**
+
+```
+Pre-Landing Review: N issues (X critical, Y informational)
+
+**CRITICAL** (blocking):
+- [file:line] Problem description
+  Fix: suggested fix
+
+**Issues** (non-blocking):
+- [file:line] Problem description
+  Fix: suggested fix
+```
+
+If no issues: `Pre-Landing Review: No issues found.`
+
+Be terse. One line problem, one line fix. No preamble.
+
+---
+
+## Pass 1 — CRITICAL (blocking)
+
+### Injection & Data Safety
+- String interpolation in SQL/database queries (even with type casting — use parameterized queries)
+- Unsanitized user input written to database or rendered in HTML
+- Raw HTML output from user-controlled data (`innerHTML`, `dangerouslySetInnerHTML`, `html_safe`, `raw()`, `| safe`)
+- Command injection via string concatenation in shell commands (use argument arrays)
+- Path traversal via user input in file operations
+
+### Race Conditions & Concurrency
+- Read-check-write without atomic operations (check-then-set should be atomic WHERE + UPDATE)
+- Find-or-create without unique database constraint (concurrent calls create duplicates)
+- Status transitions without atomic WHERE old_status + UPDATE new_status
+- Shared mutable state accessed without synchronization
+
+### Security Boundaries
+- Missing authentication checks on new endpoints/routes
+- Privilege escalation paths (user can access/modify another user's data — IDOR)
+- Secrets in logs, error responses, or client-side code
+- LLM/AI output written to database or used in queries without validation
+- JWT/token comparison using `==` instead of constant-time comparison
+
+### Auth & Access Control
+- New API endpoints without auth middleware
+- Missing authorization check (authenticated but not authorized)
+- Admin-only operations accessible to regular users
+- Session fixation or token reuse vulnerabilities
+
+---
+
+## Pass 2 — INFORMATIONAL (non-blocking)
+
+### Conditional Side Effects
+- Code branches on condition but forgets side effect on one branch (e.g., sets status but not associated data)
+- Log messages claiming action happened but action was conditionally skipped
+
+### Magic Numbers & String Coupling
+- Bare numeric literals used in multiple files — should be named constants
+- Error message strings used as query filters elsewhere (grep for the string)
+
+### Dead Code & Consistency
+- Variables assigned but never read
+- Stale comments describing old behavior after code changed
+- Import/require statements for unused modules
+
+### Test Gaps
+- Missing negative-path tests (error cases, validation failures)
+- Assertions on type/status but not side effects (e.g., checks status but not that email was sent)
+- Missing integration tests for security enforcement (auth, rate limiting, access control)
+
+### Type Coercion at Boundaries
+- Values crossing language/system boundaries where type could change (string vs number)
+- Hash/digest inputs that don't normalize types before serialization
+
+### Performance
+- O(n*m) lookups in views/templates (array search inside loops — use hash/map lookup)
+- Missing pagination on list endpoints returning unbounded results
+- N+1 queries: loading associations inside loops without eager loading
+- Unbounded queries without LIMIT
+
+---
+
+## Suppressions — DO NOT flag these
+
+- Redundancy that aids readability (e.g., `present?` redundant with length check)
+- "Add comment explaining why this threshold was chosen" — thresholds change, comments rot
+- "This assertion could be tighter" when assertion already covers the behavior
+- Consistency-only changes (wrapping a value to match how another constant is guarded)
+- Harmless no-ops (e.g., `.filter()` on array that never contains the filtered value)
+- ANYTHING already addressed in the diff being reviewed — read the FULL diff before commenting
+- Style/formatting issues (use a linter for that)
+- "Consider using X instead of Y" when Y works fine
--- a/.opencode/skills/code-review/references/checklists/web-app.md
+++ b/.opencode/skills/code-review/references/checklists/web-app.md
@@ -0,0 +1,54 @@
+# Web App Review Checklist (Overlay)
+
+Additive to `base.md`. Apply when project has frontend framework (React, Vue, Svelte, Next.js, etc.).
+
+## Detection
+
+Apply this overlay when any of these are true:
+- `package.json` has `react`, `vue`, `svelte`, `next`, `nuxt`, `angular` dependency
+- Project has `src/pages/`, `src/app/`, `src/components/`, `src/views/` directories
+- HTML/JSX/TSX/Vue files in the diff
+
+---
+
+## Pass 1 — CRITICAL (additions to base)
+
+### XSS
+- `innerHTML` assignment from any non-static source
+- Template literals interpolated into DOM without escaping
+- URL parameters rendered without sanitization
+- `<a href={userInput}>` without protocol validation (javascript: protocol)
+- Server-rendered user content without HTML entity encoding
+
+### CSRF
+- State-changing endpoints (POST/PUT/DELETE) without CSRF token verification
+- Cookie-based auth without SameSite attribute
+- Form submissions to external URLs
+
+### N+1 Queries (server-rendered views)
+- Database queries inside loops rendering lists
+- Missing eager loading for associations rendered in views/pages
+- Sequential API calls that could be batched
+
+---
+
+## Pass 2 — INFORMATIONAL (additions to base)
+
+### Frontend Performance
+- Inline `<style>` blocks in components re-parsed every render
+- Missing `key` prop on list items
+- Large bundle imports that could be lazy-loaded (e.g., full lodash instead of lodash/get)
+- Images without width/height causing layout shift
+- Missing `loading="lazy"` on below-fold images
+
+### Accessibility
+- Interactive elements without keyboard support (onClick without onKeyDown)
+- Missing `alt` text on images
+- Form inputs without associated labels
+- Color-only indicators (no text/icon fallback)
+- Missing ARIA attributes on custom interactive components
+
+### Responsive / Layout
+- Fixed pixel widths that break on mobile
+- Missing viewport meta tag
+- Overflow hidden cutting off content on small screens
--- a/.opencode/skills/code-review/references/code-review-reception.md
+++ b/.opencode/skills/code-review/references/code-review-reception.md
@@ -0,0 +1,113 @@
+---
+name: receiving-code-review
+description: Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement
+---
+
+# Code Review Reception
+
+**Core principle:** Verify before implementing. Ask before assuming. Technical correctness over social comfort.
+
+## Response Pattern
+
+```
+1. READ: Complete feedback without reacting
+2. UNDERSTAND: Restate requirement (or ask)
+3. VERIFY: Check against codebase reality
+4. EVALUATE: Technically sound for THIS codebase?
+5. RESPOND: Technical acknowledgment or reasoned pushback
+6. IMPLEMENT: One at a time, test each
+```
+
+## Forbidden Responses
+
+❌ "You're absolutely right!" / "Great point!" / "Thanks for [anything]"
+❌ "Let me implement that now" (before verification)
+
+✅ Restate technical requirement
+✅ Ask clarifying questions
+✅ Push back with technical reasoning
+✅ Just start working (actions > words)
+
+## Handling Unclear Feedback
+
+```
+IF any item unclear:
+  STOP - don't implement anything
+  ASK for clarification on ALL unclear items
+
+WHY: Items may be related. Partial understanding = wrong implementation.
+```
+
+## Source-Specific Handling
+
+**Human partner:** Trusted - implement after understanding, no performative agreement
+
+**External reviewers:**
+```
+BEFORE implementing:
+  1. Technically correct for THIS codebase?
+  2. Breaks existing functionality?
+  3. Reason for current implementation?
+  4. Works all platforms/versions?
+
+IF wrong: Push back with technical reasoning
+IF can't verify: State limitation, ask direction
+IF conflicts with partner's decisions: Stop, discuss first
+```
+
+## YAGNI Check
+
+```
+IF reviewer suggests "implementing properly":
+  grep codebase for actual usage
+  IF unused: "This isn't called. Remove it (YAGNI)?"
+  IF used: Implement properly
+```
+
+## Implementation Order
+
+```
+1. Clarify unclear items FIRST
+2. Implement: blocking → simple → complex
+3. Test each individually
+4. Verify no regressions
+```
+
+## When To Push Back
+
+- Breaks existing functionality
+- Reviewer lacks full context
+- Violates YAGNI (unused feature)
+- Technically incorrect for stack
+- Legacy/compatibility reasons
+- Conflicts with architectural decisions
+
+**How:** Technical reasoning, specific questions, reference working tests
+
+## Acknowledging Correct Feedback
+
+✅ "Fixed. [Brief description]"
+✅ "Good catch - [issue]. Fixed in [location]."
+✅ Just fix it (actions > words)
+
+❌ ANY gratitude or performative expression
+
+## Correcting Wrong Pushback
+
+✅ "You were right - checked [X], it does [Y]. Implementing."
+❌ Long apology, defending, over-explaining
+
+## Quick Reference
+
+| Mistake | Fix |
+|---------|-----|
+| Performative agreement | State requirement or act |
+| Blind implementation | Verify against codebase |
+| Batch without testing | One at a time |
+| Assuming reviewer right | Check if breaks things |
+| Avoiding pushback | Technical correctness > comfort |
+
+## Bottom Line
+
+External feedback = suggestions to evaluate, not orders.
+Verify. Question. Then implement.
--- a/.opencode/skills/code-review/references/codebase-scan-workflow.md
+++ b/.opencode/skills/code-review/references/codebase-scan-workflow.md
@@ -0,0 +1,30 @@
+# Codebase Scan Workflow
+
+Think harder to scan the codebase and analyze it follow the Orchestration Protocol, Core Responsibilities, Subagents Team and Development Rules:
+<tasks>$ARGUMENTS</tasks>
+
+## Role Responsibilities
+- You are an elite software engineering expert who specializes in system architecture design and technical decision-making.
+- You operate by: **YAGNI**, **KISS**, and **DRY**.
+- Sacrifice grammar for concision. List unresolved questions at end.
+
+## Workflow
+
+### Research
+* Use 2 `researcher` subagents in parallel to search up to 5 sources
+* Keep every research report concise (≤150 lines)
+* Use `/ck:scout` skill invocation to search the codebase
+
+### Code Review
+* Use multiple `code-reviewer` subagents in parallel to review code
+* If issues found, ask main agent to improve and repeat until tests pass
+* When complete, run adversarial review (see `adversarial-review.md`) — always-on, no exceptions
+* Report combined quality + adversarial findings to user
+
+### Plan
+* Use `planner` subagent to analyze reports and create improvement plan
+* Save overview at `plan.md`, phase files as `phase-XX-phase-name.md`
+
+### Final Report
+* Summary of changes, guide user to get started, suggest next steps
+* Ask user if they want to commit and push
--- a/.opencode/skills/code-review/references/edge-case-scouting.md
+++ b/.opencode/skills/code-review/references/edge-case-scouting.md
@@ -0,0 +1,119 @@
+---
+name: edge-case-scouting
+description: Use after implementation, before code review to proactively find edge cases, side effects, and potential issues via scout skill - catches problems code-reviewer might miss
+---
+
+# Edge Case Scouting
+
+Proactive detection of edge cases, side effects, and potential issues before code review.
+
+## Purpose
+
+Code reviews catch obvious issues but miss subtle side effects. Scout detects:
+- Files affected by changes reviewer might not check
+- Data flow paths that could break
+- Boundary conditions and error paths
+- Integration issues across modules
+
+## When to Use
+
+**Mandatory:** Multi-file features, shared utility refactors, complex bug fixes
+**Optional:** Single-file changes, docs, config
+
+## Process
+
+### 1. Identify Changed Files
+```bash
+git diff --name-only HEAD~1
+```
+
+### 2. Invoke Scout
+```
+/ck:scout edge cases for recent changes.
+
+Changed: {files from git diff}
+
+Find:
+1. Files importing/depending on changed modules
+2. Data flow paths through modified functions
+3. Error handling paths not tested
+4. Boundary conditions (null, empty, max)
+5. Race conditions in async code
+6. State management side effects
+```
+
+### 3. Analyze & Act
+
+| Finding | Action |
+|---------|--------|
+| Affected file not in scope | Add to review |
+| Data flow risk | Verify or add test |
+| Edge case | Add test or verify |
+| Missing test | Add before review |
+
+### 4. Document for Review
+```
+Scout findings:
+- {issues found}
+- Verified: {what checked}
+- Addressed: {what fixed}
+- Needs review: {remaining}
+```
+
+## Scout Prompts
+
+**Feature:**
+```
+Scout edge cases for {feature}.
+Changed: {files}
+Find: consumers, error states, untested inputs, performance, compatibility
+```
+
+**Bug fix:**
+```
+Scout side effects of fix in {file}.
+Bug: {description}, Fix: {approach}
+Find: other paths using logic, dependent features, similar bugs
+```
+
+**Refactor:**
+```
+Scout breaking changes in {module}.
+Before: {old}, After: {new}
+Find: importers, behavior diffs, removed functionality
+```
+
+## What Scout Catches
+
+| Issue | Why Missed | Scout Detects |
+|-------|------------|---------------|
+| Indirect deps | Not in diff | Traces imports |
+| Race conditions | Hard static review | Analyzes flow |
+| State mutations | Hidden side effects | Tracks data |
+| Missing null checks | Assumed safe | Boundary analysis |
+| Integration breaks | Out of scope | Cross-module search |
+
+## Red Flags
+
+- Shared utility changed but only one caller tested
+- Error path leads to unhandled rejection
+- State modified in place without notification
+- Breaking change without migration
+
+## Example
+
+```
+1. Done: Add cache to UserService.getUser()
+2. Diff: src/services/user-service.ts
+3. Scout: "edge cases for caching in getUser()"
+4. Report:
+   - ProfileComponent expects fresh data on edit
+   - AdminPanel loops getUser() (memory risk)
+   - No cache clear on updateUser()
+5. Fix: Add invalidation, maxSize
+6. Document for code-reviewer
+```
+
+## Bottom Line
+
+Scout before review. Don't trust "simple changes" - scout them anyway.
--- a/.opencode/skills/code-review/references/input-mode-resolution.md
+++ b/.opencode/skills/code-review/references/input-mode-resolution.md
@@ -0,0 +1,135 @@
+---
+name: input-mode-resolution
+description: How to parse code-review arguments and resolve PR number, commit hash, pending changes, or default context into a reviewable diff
+---
+
+# Input Mode Resolution
+
+Resolve `/code-review` arguments into a diff for the review pipeline.
+
+## Auto-Detection Rules
+
+Parse arguments left-to-right. First match wins.
+
+| Pattern | Mode | Example |
+|---------|------|---------|
+| `#\d+` | PR | `#123`, `#45` |
+| GitHub PR URL | PR | `https://github.com/org/repo/pull/123` |
+| `[0-9a-f]{7,40}` | Commit | `abc1234`, full SHA |
+| `--pending` | Pending | explicit flag |
+| `codebase` | Codebase | existing mode |
+| *(none + context)* | Default | recent changes |
+| *(none + no context)* | Prompt | ask user via `AskUserQuestion` |
+
+## Resolution Commands
+
+### PR Mode
+
+```bash
+# Extract PR number from argument
+PR_NUM=$(echo "$ARG" | grep -oE '[0-9]+$')
+
+# Fetch PR metadata
+gh pr view "$PR_NUM" --json title,body,files,additions,deletions,baseRefName,headRefName
+
+# Get the diff
+gh pr diff "$PR_NUM"
+
+# Get changed file list
+gh pr diff "$PR_NUM" --name-only
+```
+
+**Context passed to reviewers:**
+- PR title and description (intent)
+- Base branch (what it merges into)
+- Full diff
+- Changed file list for scout
+
+### Commit Mode
+
+```bash
+# Validate commit exists
+git cat-file -t "$COMMIT_HASH"
+
+# Get commit metadata
+git log -1 --format="%H%n%s%n%b" "$COMMIT_HASH"
+
+# Get the diff
+git show "$COMMIT_HASH" --stat
+git show "$COMMIT_HASH" -- # full diff
+
+# Changed files
+git show "$COMMIT_HASH" --name-only --format=""
+```
+
+**Context passed to reviewers:**
+- Commit message (intent)
+- Parent commit (what it changed from)
+- Full diff
+- Changed file list for scout
+
+### Pending Mode
+
+```bash
+# Staged changes
+git diff --cached
+
+# Unstaged changes
+git diff
+
+# Combined (staged + unstaged vs HEAD)
+git diff HEAD
+
+# Changed files
+git diff HEAD --name-only
+
+# Status overview
+git status --short
+```
+
+**Context passed to reviewers:**
+- No commit message yet — ask user for brief intent description
+- Combined diff (staged + unstaged)
+- Changed file list for scout
+
+### Default Mode
+
+Use recent changes already in conversation context. If no changes apparent, fall back to Prompt mode.
+
+### Prompt Mode
+
+When no arguments and no recent context, use `AskUserQuestion`:
+- Header: "Review Target"
+- Question: "What would you like to review?"
+- Options: Pending changes, Enter PR number, Enter commit hash, Full codebase scan, Parallel codebase audit
+
+For PR/commit options, follow up with second `AskUserQuestion` to get the number/hash.
+
+### Codebase Mode
+
+Codebase modes bypass diff resolution — they scan the full codebase instead.
+- `codebase` → hand off to `references/codebase-scan-workflow.md`
+- `codebase parallel` → hand off to `references/parallel-review-workflow.md`
+
+Both workflows include adversarial review (always-on).
+
+## Pipeline Handoff
+
+After resolving the diff, pass to the review pipeline:
+
+```
+Resolved diff
+  ├─ Changed files → Edge case scout
+  ├─ Full diff → Stage 1 (Spec compliance, if plan exists)
+  ├─ Full diff → Stage 2 (Code quality review)
+  └─ Full diff + findings → Stage 3 (Adversarial review)
+```
+
+## Error Handling
+
+| Error | Action |
+|-------|--------|
+| PR not found | `gh pr view` fails → report "PR #N not found in this repo" |
+| Commit not found | `git cat-file` fails → report "Commit not found — is it pushed?" |
+| No pending changes | `git diff HEAD` empty → report "No pending changes to review" |
+| Ambiguous input | Could be PR or commit → prefer PR (more common), note assumption |
--- a/.opencode/skills/code-review/references/parallel-review-workflow.md
+++ b/.opencode/skills/code-review/references/parallel-review-workflow.md
@@ -0,0 +1,76 @@
+# Parallel Review Workflow
+
+**Ultrathink** to exhaustively list ALL potential edge cases, then dispatch parallel `code-reviewer` agents to verify: <scope>$ARGUMENTS</scope>
+
+**IMPORTANT:** Activate needed skills. Ensure token efficiency. Sacrifice grammar for concision.
+
+## Workflow
+
+### 1. Ultrathink Edge Cases
+
+Main agent deeply analyzes the scope to LIST all potential edge cases FIRST:
+- Read `codebase-summary.md` for context
+- Use `/ck:scout` to find relevant files
+- **Think exhaustively** about what could go wrong:
+  - Null/undefined scenarios
+  - Boundary conditions (off-by-one, empty, max values)
+  - Error handling gaps
+  - Race conditions, async edge cases
+  - Input validation holes
+  - Security vulnerabilities
+  - Resource leaks
+  - Untested code paths
+
+**Output format:**
+```markdown
+## Edge Cases Identified
+
+### Category: [scope-area]
+1. [edge case description] → files: [file1, file2]
+```
+
+### 2. Categorize & Assign
+
+Group edge cases by similar scope for parallel verification:
+- Each category → one `code-reviewer` agent
+- Max 6 categories (merge small ones)
+- Each reviewer gets specific edge cases to VERIFY, not discover
+
+### 3. Parallel Verification
+
+Launch N `code-reviewer` subagents simultaneously:
+- Pass: category name, list of edge cases, relevant files
+- Task: **VERIFY** if each edge case is properly handled in code
+- Report: which edge cases are handled vs unhandled
+
+### 4. Aggregate Results
+
+```markdown
+## Edge Case Verification Report
+
+### Summary
+- Total edge cases: X
+- Handled: Y
+- Unhandled: Z
+- Partial: W
+
+### Unhandled Edge Cases (Need Fix)
+| # | Edge Case | File | Status |
+|---|-----------|------|--------|
+```
+
+### 5. Adversarial Review (Always-On)
+
+After aggregation, spawn adversarial reviewer (see `adversarial-review.md`) on the full scope:
+- Adversarial reviewer receives aggregated findings + unhandled edge cases as context
+- Actively tries to break the code beyond what edge case verification found
+- Adjudicate findings: Accept / Reject / Defer
+
+### 6. Auto-Fix Pipeline
+
+**IF** unhandled/partial edge cases found:
+- Ask: "Found N unhandled edge cases. Fix with /ck:fix --parallel? [Y/n]"
+
+### 7. Final Report
+- Summary of verification
+- Ask: "Commit? [Y/n]" → use `git-manager`
--- a/.opencode/skills/code-review/references/requesting-code-review.md
+++ b/.opencode/skills/code-review/references/requesting-code-review.md
@@ -0,0 +1,116 @@
+---
+name: requesting-code-review
+description: Use when completing tasks, implementing major features, or before merging to verify work meets requirements - dispatches code-reviewer subagent to review implementation against plan or requirements before proceeding
+---
+
+# Requesting Code Review
+
+Dispatch code-reviewer subagent to catch issues before they cascade.
+
+**Core principle:** Scout first, review often.
+
+## When to Request Review
+
+**Mandatory:**
+- After each task in subagent-driven development
+- After completing major feature
+- Before merge to main
+
+**Optional but valuable:**
+- When stuck (fresh perspective)
+- Before refactoring (baseline check)
+- After fixing complex bug
+
+## How to Request
+
+**0. Scout edge cases first (NEW):**
+```
+Before dispatching code-reviewer, invoke /ck:scout to find:
+- Files affected by changes (not just modified files)
+- Data flow paths that could break
+- Edge cases and boundary conditions
+- Potential side effects
+
+See: references/edge-case-scouting.md
+```
+
+**1. Get git SHAs:**
+```bash
+BASE_SHA=$(git rev-parse HEAD~1)  # or origin/main
+HEAD_SHA=$(git rev-parse HEAD)
+```
+
+**2. Dispatch code-reviewer subagent:**
+
+Use Task tool with `code-reviewer` type, fill template at `code-reviewer.md`
+
+**Placeholders:**
+- `{WHAT_WAS_IMPLEMENTED}` - What you just built
+- `{PLAN_OR_REQUIREMENTS}` - What it should do
+- `{BASE_SHA}` - Starting commit
+- `{HEAD_SHA}` - Ending commit
+- `{DESCRIPTION}` - Brief summary
+
+**3. Act on feedback:**
+- Fix Critical issues immediately
+- Fix Important issues before proceeding
+- Note Minor issues for later
+- Push back if reviewer is wrong (with reasoning)
+
+## Example
+
+```
+[Just completed Task 2: Add verification function]
+
+You: Let me request code review before proceeding.
+
+BASE_SHA=$(git log --oneline | grep "Task 1" | head -1 | awk '{print $1}')
+HEAD_SHA=$(git rev-parse HEAD)
+
+[Dispatch code-reviewer subagent]
+  WHAT_WAS_IMPLEMENTED: Verification and repair functions for conversation index
+  PLAN_OR_REQUIREMENTS: Task 2 from docs/plans/deployment-plan.md
+  BASE_SHA: a7981ec
+  HEAD_SHA: 3df7661
+  DESCRIPTION: Added verifyIndex() and repairIndex() with 4 issue types
+
+[Subagent returns]:
+  Strengths: Clean architecture, real tests
+  Issues:
+    Important: Missing progress indicators
+    Minor: Magic number (100) for reporting interval
+  Assessment: Ready to proceed
+
+You: [Fix progress indicators]
+[Continue to Task 3]
+```
+
+## Integration with Workflows
+
+**Subagent-Driven Development:**
+- Review after EACH task
+- Catch issues before they compound
+- Fix before moving to next task
+
+**Executing Plans:**
+- Review after each batch (3 tasks)
+- Get feedback, apply, continue
+
+**Ad-Hoc Development:**
+- Review before merge
+- Review when stuck
+
+## Red Flags
+
+**Never:**
+- Skip review because "it's simple"
+- Ignore Critical issues
+- Proceed with unfixed Important issues
+- Argue with valid technical feedback
+
+**If reviewer wrong:**
+- Push back with technical reasoning
+- Show code/tests that prove it works
+- Request clarification
+
+See template at: requesting-code-review/code-reviewer.md
--- a/.opencode/skills/code-review/references/spec-compliance-review.md
+++ b/.opencode/skills/code-review/references/spec-compliance-review.md
@@ -0,0 +1,43 @@
+---
+name: spec-compliance-review
+description: First-pass review checking implementation matches spec/plan requirements before quality review
+---
+
+# Spec Compliance Review
+
+## Purpose
+
+Verify implementation matches what was requested BEFORE evaluating code quality.
+Well-written code that doesn't match requirements is still wrong.
+
+## When to Use
+
+- After implementing features from a plan
+- Before code quality review pass
+- When plan/spec exists for the work being reviewed
+
+## Process
+
+1. **Load spec/plan** — Read the plan.md or phase file that defined this work
+2. **List requirements** — Extract every requirement, acceptance criterion
+3. **Check each requirement** against actual implementation:
+   - Present? → PASS
+   - Missing? → MISSING (must fix before quality review)
+   - Extra (not in spec)? → EXTRA (flag for removal unless justified)
+4. **Verdict:**
+   - All requirements met, no unjustified extras → PASS → proceed to quality review
+   - Missing requirements → FAIL → implementer fixes → re-review
+   - Unjustified extras → WARN → discuss with user
+
+## Checklist Template
+
+| # | Requirement | Status | Notes |
+|---|-------------|--------|-------|
+| 1 | [from spec] | PASS/MISSING/EXTRA | [evidence] |
+
+## Red Flags
+
+- Skipping spec review because "code looks good"
+- Accepting extra features without spec justification
+- Treating spec review as optional
+- Reviewing code quality before confirming spec compliance
--- a/.opencode/skills/code-review/references/task-management-reviews.md
+++ b/.opencode/skills/code-review/references/task-management-reviews.md
@@ -0,0 +1,155 @@
+# Review Task Management Patterns
+
+Track review pipeline execution via Claude Native Tasks (TaskCreate, TaskUpdate, TaskList).
+
+## When to Create Tasks
+
+| Review Scope | Tasks? | Rationale |
+|--------------|--------|-----------|
+| Single-file fix | No | Scout + review + done, overhead not worth it |
+| Multi-file feature (3+ files) | Yes | Track scout → review → fix → verify chain |
+| Parallel reviewers (2+ scopes) | Yes | Coordinate independent reviews |
+| Review cycle with Critical fixes | Yes | Dependencies between fix → re-verify |
+
+**3-Task Rule:** Skip task creation when review pipeline has <3 meaningful steps.
+
+## Review Pipeline as Tasks
+
+```
+TaskCreate: "Scout edge cases"         → pending
+TaskCreate: "Review implementation"    → pending, blockedBy: [scout]
+TaskCreate: "Adversarial review"       → pending, blockedBy: [review]
+TaskCreate: "Fix critical issues"      → pending, blockedBy: [adversarial]
+TaskCreate: "Verify fixes pass"        → pending, blockedBy: [fix]
+```
+
+Dependency chain auto-unblocks: scout → review → adversarial → fix → verify.
+
+## Task Schemas
+
+### Scout Task
+
+```
+TaskCreate(
+  subject: "Scout edge cases for {feature}",
+  activeForm: "Scouting edge cases",
+  description: "Identify affected files, data flows, boundary conditions. Changed: {files}",
+  metadata: { reviewStage: "scout", feature: "{feature}",
+              changedFiles: "src/auth.ts,src/middleware.ts",
+              priority: "P2", effort: "3m" }
+)
+```
+
+### Review Task
+
+```
+TaskCreate(
+  subject: "Review {feature} implementation",
+  activeForm: "Reviewing {feature}",
+  description: "Code-reviewer subagent reviews {BASE_SHA}..{HEAD_SHA}. Plan: {plan_ref}",
+  metadata: { reviewStage: "review", feature: "{feature}",
+              baseSha: "{BASE_SHA}", headSha: "{HEAD_SHA}",
+              priority: "P1", effort: "10m" },
+  addBlockedBy: ["{scout-task-id}"]
+)
+```
+
+### Adversarial Task
+
+```
+TaskCreate(
+  subject: "Adversarial review for {feature}",
+  activeForm: "Red-teaming {feature}",
+  description: "Spawn adversarial reviewer to break the code. See references/adversarial-review.md",
+  metadata: { reviewStage: "adversarial", feature: "{feature}",
+              priority: "P1", effort: "10m" },
+  addBlockedBy: ["{review-task-id}"]
+)
+```
+
+### Fix Task (created after adversarial finds issues)
+
+```
+TaskCreate(
+  subject: "Fix {severity} issues from review",
+  activeForm: "Fixing {severity} review issues",
+  description: "Address: {issue_list}",
+  metadata: { reviewStage: "fix", severity: "critical",
+              issueCount: 3, priority: "P1", effort: "15m" },
+  addBlockedBy: ["{review-task-id}"]
+)
+```
+
+### Verify Task
+
+```
+TaskCreate(
+  subject: "Verify fixes pass tests and build",
+  activeForm: "Verifying fixes",
+  description: "Run test suite, build, confirm 0 failures. Evidence before claims.",
+  metadata: { reviewStage: "verify", priority: "P1", effort: "5m" },
+  addBlockedBy: ["{fix-task-id}"]
+)
+```
+
+## Parallel Review Coordination
+
+For multi-scope reviews (e.g., backend + frontend changed independently):
+
+```
+// Create scoped review tasks — no blockedBy between them
+TaskCreate(subject: "Review backend auth changes",
+  metadata: { reviewStage: "review", scope: "src/api/,src/middleware/",
+              agentIndex: 1, totalAgents: 2, priority: "P1" })
+
+TaskCreate(subject: "Review frontend auth UI",
+  metadata: { reviewStage: "review", scope: "src/components/auth/",
+              agentIndex: 2, totalAgents: 2, priority: "P1" })
+
+// Both run simultaneously via separate code-reviewer subagents
+// Fix task blocks on BOTH completing:
+TaskCreate(subject: "Fix all review issues",
+  addBlockedBy: ["{backend-review-id}", "{frontend-review-id}"])
+```
+
+## Task Lifecycle
+
+```
+Scout:       pending → in_progress → completed (scout report returned)
+Review:      pending → in_progress → completed (reviewer findings returned)
+Adversarial: pending → in_progress → completed (red-team findings adjudicated)
+Fix:         pending → in_progress → completed (all Critical/Important fixed)
+Verify:      pending → in_progress → completed (tests pass, build clean)
+```
+
+### Handling Re-Reviews
+
+When fixes introduce new issues → create new review cycle:
+
+```
+TaskCreate(subject: "Re-review after fixes",
+  addBlockedBy: ["{fix-task-id}"],
+  metadata: { reviewStage: "review", cycle: 2, priority: "P1" })
+```
+
+Limit to 3 cycles. If still failing after cycle 3 → escalate to user.
+
+## Integration with Planning Tasks
+
+Review tasks are **separate from** cook/planning phase tasks.
+
+**When cook spawns review:**
+1. Cook completes implementation phase → creates review pipeline tasks
+2. Review pipeline executes (scout → review → adversarial → fix → verify)
+3. All review tasks complete → cook marks phase as reviewed
+4. Cook proceeds to next phase
+
+Review tasks reference the phase but don't block it directly — the orchestrator manages handoff.
+
+## Quality Check
+
+After pipeline registration: `Registered [N] review tasks (scout → review → adversarial → fix → verify chain)`
+
+## Error Handling
+
+If `TaskCreate` fails: log warning, fall back to sequential review without task tracking. Review pipeline functions identically — tasks add visibility, not functionality.
--- a/.opencode/skills/code-review/references/verification-before-completion.md
+++ b/.opencode/skills/code-review/references/verification-before-completion.md
@@ -0,0 +1,139 @@
+---
+name: verification-before-completion
+description: Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always
+---
+
+# Verification Before Completion
+
+## Overview
+
+Claiming work is complete without verification is dishonesty, not efficiency.
+
+**Core principle:** Evidence before claims, always.
+
+**Violating the letter of this rule is violating the spirit of this rule.**
+
+## The Iron Law
+
+```
+NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
+```
+
+If you haven't run the verification command in this message, you cannot claim it passes.
+
+## The Gate Function
+
+```
+BEFORE claiming any status or expressing satisfaction:
+
+1. IDENTIFY: What command proves this claim?
+2. RUN: Execute the FULL command (fresh, complete)
+3. READ: Full output, check exit code, count failures
+4. VERIFY: Does output confirm the claim?
+   - If NO: State actual status with evidence
+   - If YES: State claim WITH evidence
+5. ONLY THEN: Make the claim
+
+Skip any step = lying, not verifying
+```
+
+## Common Failures
+
+| Claim | Requires | Not Sufficient |
+|-------|----------|----------------|
+| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
+| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
+| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
+| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
+| Regression test works | Red-green cycle verified | Test passes once |
+| Agent completed | VCS diff shows changes | Agent reports "success" |
+| Requirements met | Line-by-line checklist | Tests passing |
+
+## Red Flags - STOP
+
+- Using "should", "probably", "seems to"
+- Expressing satisfaction before verification ("Great!", "Perfect!", "Done!", etc.)
+- About to commit/push/PR without verification
+- Trusting agent success reports
+- Relying on partial verification
+- Thinking "just this once"
+- Tired and wanting work over
+- **ANY wording implying success without having run verification**
+
+## Rationalization Prevention
+
+| Excuse | Reality |
+|--------|---------|
+| "Should work now" | RUN the verification |
+| "I'm confident" | Confidence ≠ evidence |
+| "Just this once" | No exceptions |
+| "Linter passed" | Linter ≠ compiler |
+| "Agent said success" | Verify independently |
+| "I'm tired" | Exhaustion ≠ excuse |
+| "Partial check is enough" | Partial proves nothing |
+| "Different words so rule doesn't apply" | Spirit over letter |
+
+## Key Patterns
+
+**Tests:**
+```
+✅ [Run test command] [See: 34/34 pass] "All tests pass"
+❌ "Should pass now" / "Looks correct"
+```
+
+**Regression tests (TDD Red-Green):**
+```
+✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass)
+❌ "I've written a regression test" (without red-green verification)
+```
+
+**Build:**
+```
+✅ [Run build] [See: exit 0] "Build passes"
+❌ "Linter passed" (linter doesn't check compilation)
+```
+
+**Requirements:**
+```
+✅ Re-read plan → Create checklist → Verify each → Report gaps or completion
+❌ "Tests pass, phase complete"
+```
+
+**Agent delegation:**
+```
+✅ Agent reports success → Check VCS diff → Verify changes → Report actual state
+❌ Trust agent report
+```
+
+## Why This Matters
+
+From 24 failure memories:
+- your human partner said "I don't believe you" - trust broken
+- Undefined functions shipped - would crash
+- Missing requirements shipped - incomplete features
+- Time wasted on false completion → redirect → rework
+- Violates: "Honesty is a core value. If you lie, you'll be replaced."
+
+## When To Apply
+
+**ALWAYS before:**
+- ANY variation of success/completion claims
+- ANY expression of satisfaction
+- ANY positive statement about work state
+- Committing, PR creation, task completion
+- Moving to next task
+- Delegating to agents
+
+**Rule applies to:**
+- Exact phrases
+- Paraphrases and synonyms
+- Implications of success
+- ANY communication suggesting completion/correctness
+
+## The Bottom Line
+
+**No shortcuts for verification.**
+
+Run the command. Read the output. THEN claim the result.
+
+This is non-negotiable.