Context Degradation Patterns
Model performance degrades predictably as context grows. Degradation is a continuum, not a binary switch.
Degradation Patterns
| Pattern | Cause | Detection |
|---|---|---|
| Lost-in-Middle | U-shaped attention | Critical info recall drops 10-40% |
| Context Poisoning | Errors compound via reference | Persistent hallucinations despite correction |
| Context Distraction | Irrelevant info overwhelms | Single distractor degrades performance |
| Context Confusion | Multiple tasks mix | Wrong tool calls, mixed requirements |
| Context Clash | Contradictory info | Conflicting outputs, inconsistent reasoning |
Lost-in-Middle Phenomenon
- Information in middle gets 10-40% lower recall
- Models allocate massive attention to first token (BOS sink)
- As context grows, middle tokens fail to get sufficient attention
- Mitigation: Place critical info at beginning/end
[CURRENT TASK] # Beginning - high attention
- Critical requirements
[DETAILED CONTEXT] # Middle - lower attention
- Supporting details
[KEY FINDINGS] # End - high attention
- Important conclusions
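The layout above can be assembled programmatically. The following is a minimal sketch, assuming each section is a plain list of strings; the function name `build_prompt` and its parameters are hypothetical and not part of any library.

```python
def build_prompt(task, details, findings):
    """Assemble a prompt with critical content at the edges of the context."""
    sections = [
        ("CURRENT TASK", task),         # beginning: high attention
        ("DETAILED CONTEXT", details),  # middle: lower attention
        ("KEY FINDINGS", findings),     # end: high attention
    ]
    blocks = []
    for title, items in sections:
        lines = [f"[{title}]"] + [f"- {item}" for item in items]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


prompt = build_prompt(
    task=["Fix login bug"],
    details=["Stack trace from auth service"],
    findings=["Root cause: expired token"],
)
print(prompt)
```

Keeping the ordering in one place makes it harder for supporting material to drift to the edges, where it would displace the critical items.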
Context Poisoning
Entry points:
- Tool outputs with errors/unexpected formats
- Retrieved docs with incorrect/outdated info
- Model-generated summaries with hallucinations
Detection symptoms:
- Degraded quality on previously successful tasks
- Tool misalignment (wrong tools/parameters)
- Persistent hallucinations
Recovery:
- Truncate to before poisoning point
- Add an explicit note flagging the error and request re-evaluation
- Restart with clean context, preserve only verified info
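Two of the recovery options above can be sketched as list operations on the message history. This is an illustration only: the message shape (dicts with `content` and `verified` keys), the `is_poisoned` predicate, and both function names are assumptions, and detecting the poisoning point in practice is the hard part.

```python
def truncate_before_poison(messages, is_poisoned):
    """Return the messages that precede the first poisoned one."""
    for index, message in enumerate(messages):
        if is_poisoned(message):
            return messages[:index]
    return list(messages)  # nothing flagged: keep the full history


def restart_clean(messages):
    """Start over, preserving only messages marked as verified."""
    return [m for m in messages if m.get("verified")]


history = [
    {"content": "User asked for a sales report", "verified": True},
    {"content": "Draft query plan", "verified": False},
    {"content": "ERROR: malformed tool output", "verified": False},
    {"content": "Summary built on the bad output", "verified": False},
]
safe = truncate_before_poison(history, lambda m: m["content"].startswith("ERROR"))
print(len(safe), len(restart_clean(history)))
```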
Model Degradation Thresholds
| Model | Degradation Onset | Severe Degradation |
|---|---|---|
| GPT-5.2 | ~64K tokens | ~200K tokens |
| Claude Opus 4.5 | ~100K tokens | ~180K tokens |
| Claude Sonnet 4.5 | ~80K tokens | ~150K tokens |
| Gemini 3 Pro | ~500K tokens | ~800K tokens |
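The table can be turned into a simple lookup for runtime checks. This sketch copies the approximate figures from the table as-is; the stage labels (`ok`, `degrading`, `severe`) and the function name are hypothetical, and the boundaries should be treated as rough guides rather than hard limits.

```python
# Approximate thresholds from the table above: (onset, severe), in tokens.
THRESHOLDS = {
    "GPT-5.2": (64_000, 200_000),
    "Claude Opus 4.5": (100_000, 180_000),
    "Claude Sonnet 4.5": (80_000, 150_000),
    "Gemini 3 Pro": (500_000, 800_000),
}


def degradation_stage(model, tokens):
    """Classify a context size against the model's approximate thresholds."""
    onset, severe = THRESHOLDS[model]
    if tokens >= severe:
        return "severe"
    if tokens >= onset:
        return "degrading"
    return "ok"


print(degradation_stage("Claude Sonnet 4.5", 90_000))
```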
Four-Bucket Mitigation
- Write: Save externally (scratchpads, files)
- Select: Pull only relevant (retrieval, filtering)
- Compress: Reduce tokens (summarization)
- Isolate: Split across sub-agents (partitioning)
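The four buckets can be illustrated with one small function each. This is a rough sketch under several assumptions: all function names are hypothetical, selection uses naive word overlap instead of a real retriever, and `summarize` is a caller-supplied callable standing in for a model-based summarizer.

```python
import json
from pathlib import Path


def write_external(path, notes):
    """Write: persist notes outside the context window."""
    Path(path).write_text(json.dumps(notes))


def select_relevant(chunks, query, k=2):
    """Select: keep the k chunks sharing the most words with the query."""
    terms = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(terms & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def compress_history(messages, summarize, keep_last=2):
    """Compress: replace older messages with a single summary entry."""
    if len(messages) <= keep_last:
        return list(messages)
    split = len(messages) - keep_last
    return [summarize(messages[:split])] + messages[split:]


def isolate_tasks(tasks):
    """Isolate: give each sub-task its own empty context."""
    return [{"task": task, "context": []} for task in tasks]
```

In a real system each bucket would be backed by something heavier (a vector store for Select, a model call for Compress), but the division of responsibilities stays the same.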
Detection Heuristics
def calculate_health(utilization, degradation_risk, poisoning_risk):
    """Health score: 1.0 = healthy, 0.0 = critical."""
    score = 1.0
    # Utilization is penalized only above 0.7.
    if utilization > 0.7:
        score -= utilization * 0.5
    score -= degradation_risk * 0.3
    score -= poisoning_risk * 0.2
    return max(0.0, score)

# Thresholds: healthy >0.8, warning >0.6, degraded >0.4, critical <=0.4
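The thresholds in the comment above can be mapped to status labels with a small helper. This is a sketch: the function name `health_status` is hypothetical, and the boundaries follow the comment literally (strictly greater than each cutoff).

```python
def health_status(score):
    """Map a health score in [0.0, 1.0] to a status label."""
    if score > 0.8:
        return "healthy"
    if score > 0.6:
        return "warning"
    if score > 0.4:
        return "degraded"
    return "critical"


for sample in (0.95, 0.7, 0.5, 0.2):
    print(sample, health_status(sample))
```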
Guidelines
- Monitor how performance correlates with context length
- Place critical info at beginning/end
- Implement compaction before degradation
- Validate retrieved docs before adding
- Version documents so outdated content cannot clash with current content
- Segment tasks to prevent confusion
- Design for graceful degradation
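As one illustration of the versioning guideline, the sketch below keeps only the newest version of each document before anything is added to context. The document shape (dicts with `id`, `version`, and `text` keys) and the function name are assumptions made for the example.

```python
def latest_versions(docs):
    """Keep only the highest-version entry for each document id."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["id"])
        if current is None or doc["version"] > current["version"]:
            latest[doc["id"]] = doc
    return list(latest.values())


docs = [
    {"id": "api", "version": 1, "text": "Rate limit: 10 requests/s"},
    {"id": "faq", "version": 3, "text": "Reset passwords in settings"},
    {"id": "api", "version": 2, "text": "Rate limit: 50 requests/s"},
]
for doc in latest_versions(docs):
    print(doc["id"], doc["version"])
```

Dropping superseded versions up front removes one common source of contradictory information in the context.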
Related Topics
- Context Optimization - Mitigation techniques
- Multi-Agent Patterns - Isolation strategies