english/.opencode/skills/ck-debug/references/performance-diagnostics.md
2026-04-12 01:06:31 +07:00


# Performance Diagnostics
Identify bottlenecks, analyze query performance, and develop optimization strategies.
## When to Use
- Response times increased significantly
- Application feels slow or unresponsive
- Database queries taking too long
- High CPU/memory/disk usage
- Resource exhaustion or OOM errors
## Diagnostic Process
### 1. Quantify the Problem
**Measure before optimizing.** Establish a baseline and the current state.
- What is the expected response time vs actual?
- When did degradation start? (correlate with changes)
- Which endpoints/operations are affected?
- Is it consistent or intermittent?
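
The baseline-versus-current comparison is usually more informative with percentiles than with averages, because tail latency tends to degrade first. A minimal sketch, assuming latency samples in milliseconds have already been collected; the sample values and the `summarize` and `regression_ratio` helpers are hypothetical, for illustration only:

```python
import statistics


def summarize(samples_ms):
    """Return p50/p95/p99 for a list of latency samples in milliseconds."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def regression_ratio(baseline_ms, current_ms):
    """Ratio of current to baseline per percentile (1.0 means unchanged)."""
    base, cur = summarize(baseline_ms), summarize(current_ms)
    return {name: cur[name] / base[name] for name in base}


# Hypothetical samples: every current request takes twice as long
baseline = [float(x) for x in range(100, 200)]
current = [2 * x for x in baseline]
ratios = regression_ratio(baseline, current)
print(ratios)
```

A ratio well above 1.0 at p95 or p99 with a near-unchanged p50 suggests an intermittent problem rather than a uniform slowdown.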
### 2. Identify the Bottleneck Layer
```
Request → Network → Web Server → Application → Database → Filesystem
                                      ↓
                          External APIs / Services
```
**Elimination approach:** Measure time at each layer to find where delay occurs.
| Layer | Check | Tool |
|-------|-------|------|
| Network | Latency, DNS, TLS | `curl -w` timing, network logs |
| Web server | Request queue, connections | Server metrics, access logs |
| Application | CPU profiling, memory | Profiler, APM, `process.memoryUsage()` |
| Database | Query time, connections | `EXPLAIN ANALYZE`, `pg_stat_statements` |
| Filesystem | I/O wait, disk usage | `iostat`, `df -h` |
| External APIs | Response time, timeouts | Request logging with durations |
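
When no APM is available, the elimination approach can be approximated by wrapping each layer's work in a timer and comparing the totals. A rough sketch using only the standard library; the `timed` helper, the layer names, and the sleeps standing in for real work are all assumptions for illustration:

```python
import time
from contextlib import contextmanager

timings = {}  # layer name -> accumulated wall-clock seconds


@contextmanager
def timed(layer):
    """Accumulate wall-clock time spent inside the block under `layer`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[layer] = timings.get(layer, 0.0) + time.perf_counter() - start


def handle_request():
    # Hypothetical stand-ins for the real work done at each layer
    with timed("database"):
        time.sleep(0.05)
    with timed("external_api"):
        time.sleep(0.01)
    with timed("application"):
        sum(i * i for i in range(10_000))


handle_request()
slowest = max(timings, key=timings.get)
print(f"slowest layer: {slowest}")
```

Accumulating per layer (rather than per call) keeps the output small enough to log with every request and compare across deployments.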
### 3. Database Performance
#### PostgreSQL Diagnostics
```sql
-- Slow queries (requires pg_stat_statements extension)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC LIMIT 20;
-- Active queries right now
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
-- Largest tables by total size (includes indexes and TOAST; bloat needs a separate check)
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC LIMIT 20;
-- Missing indexes (sequential scans on large tables)
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > 100 AND seq_tup_read > 10000
ORDER BY seq_tup_read DESC;
-- Connection counts by state (compare against max_connections and pool size)
SELECT count(*), state FROM pg_stat_activity GROUP BY state;
```
#### Query Optimization
```sql
-- Analyze specific query execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) <your-query>;
```
**Look for:** Sequential scans on large tables, nested loops with high row counts, sorts that an index could avoid, and a large number of buffers touched, especially `read` (pages not found in shared buffers) rather than `hit`.
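
For long plans, a first pass over the text output can be automated. The sketch below flags sequential scans that returned many rows; it assumes the default text format produced with `ANALYZE`, and the `large_seq_scans` helper, the row threshold, and the sample plan are hypothetical. Row counts in plan output are per loop, so treat the result as a hint, not a verdict.

```python
import re

# Matches plan lines such as:
#   Seq Scan on orders  (cost=...) (actual time=0.012..9.876 rows=100000 loops=1)
SEQ_SCAN = re.compile(
    r"Seq Scan on (?P<table>\S+).*?actual time=[\d.]+\.\.[\d.]+ rows=(?P<rows>\d+)"
)


def large_seq_scans(plan_text, min_rows=10_000):
    """Return (table, actual_rows) for sequential scans at or above min_rows."""
    hits = []
    for line in plan_text.splitlines():
        match = SEQ_SCAN.search(line)
        if match and int(match["rows"]) >= min_rows:
            hits.append((match["table"], int(match["rows"])))
    return hits


# Hypothetical plan text, for illustration only
sample_plan = """\
Hash Join  (cost=30.50..2150.75 rows=1000 width=16) (actual time=0.410..25.300 rows=980 loops=1)
  ->  Seq Scan on orders  (cost=0.00..1834.00 rows=100000 width=8) (actual time=0.012..9.876 rows=100000 loops=1)
  ->  Hash  (cost=18.00..18.00 rows=1000 width=8) (actual time=0.380..0.381 rows=1000 loops=1)
        ->  Seq Scan on customers  (cost=0.00..18.00 rows=1000 width=8) (actual time=0.008..0.190 rows=1000 loops=1)
"""
flagged = large_seq_scans(sample_plan)
print(flagged)
```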
### 4. Application Performance
**Common bottlenecks:**
| Issue | Symptom | Fix |
|-------|---------|-----|
| N+1 queries | Many small DB calls per request | Eager loading, batch queries |
| Memory leaks | Growing memory over time | Profile heap, check event listeners |
| Blocking I/O | High response time, low CPU | Async operations, connection pooling |
| CPU-bound | High CPU, proportional to load | Optimize algorithms, caching |
| Connection exhaustion | Intermittent timeouts | Pool sizing, connection reuse |
| Large payloads | Slow transfers, high memory | Pagination, compression, streaming |
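
The N+1 pattern from the table is easiest to recognize by counting queries per request. The sketch below reproduces it against an in-memory SQLite database and compares it with a single batched query; the schema, the data, and the `run` counting wrapper are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute a query and count it, standing in for one DB round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    # One query for the authors, then one more per author
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_batched():
    # One query in total; the LEFT JOIN keeps authors that have no posts
    result = {}
    rows = run(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result


counter["queries"] = 0
slow = titles_n_plus_one()
n_plus_one_queries = counter["queries"]

counter["queries"] = 0
fast = titles_batched()
batched_queries = counter["queries"]

print(n_plus_one_queries, batched_queries)
```

Both functions return the same data; only the number of round trips differs, and that number grows with the row count in the N+1 version.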
### 5. Optimization Strategy
**Priority order:**
1. **Quick wins** - Add missing index, fix N+1 query, enable caching
2. **Configuration** - Pool sizes, timeouts, buffer sizes, worker counts
3. **Code changes** - Algorithm optimization, data structure changes
4. **Architecture** - Caching layer, read replicas, async processing, CDN
**Always:** Measure after each change to verify improvement. One change at a time.
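
Verifying a single change can be as simple as timing the same operation before and after it. A minimal sketch that measures the effect of one quick win (caching) with the standard library; `slow_lookup` is a hypothetical stand-in for an expensive query or computation, and the repeat counts are arbitrary:

```python
import statistics
import timeit
from functools import lru_cache


def slow_lookup(n):
    """Hypothetical expensive operation (stand-in for a query or computation)."""
    return sum(i * i for i in range(n))


# The one change under test: memoize results by argument
cached_lookup = lru_cache(maxsize=128)(slow_lookup)


def median_seconds(func, repeat=5, number=20):
    """Median wall-clock time of `number` calls, taken over `repeat` trials."""
    return statistics.median(timeit.repeat(func, repeat=repeat, number=number))


before = median_seconds(lambda: slow_lookup(50_000))
after = median_seconds(lambda: cached_lookup(50_000))
speedup = before / after
print(f"before={before:.4f}s after={after:.4f}s speedup={speedup:.1f}x")
```

Using the median over several trials reduces the influence of one-off effects such as the first, uncached call.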
## Reporting Performance Issues
Include in the diagnostic report:
- **Baseline vs current** metrics (with numbers)
- **Bottleneck identification** with evidence
- **Root cause** explanation
- **Recommended fixes** with expected impact
- **Verification plan** to confirm improvement