Scale Game

Test at extremes (1000x bigger/smaller, instant/year-long) to expose fundamental truths hidden at normal scales.

Core Principle

Extremes expose fundamentals. What works at one scale often fails at another, and the failure only shows up once you push the scale.

When to Use

Symptom                                | Action
"Should scale fine" (without testing)  | Test at extremes
Uncertain about production behavior    | Scale up 1000x
Edge cases unclear                     | Test minimum and maximum
Architecture validation needed         | Extreme testing

Quick Reference

Scale Dimension | Test At Extremes             | What It Reveals
Volume          | 1 item vs 1B items           | Algorithmic complexity limits
Speed           | Instant vs 1 year            | Async requirements, caching needs
Users           | 1 user vs 1B users           | Concurrency issues, resource limits
Duration        | Milliseconds vs years        | Memory leaks, state growth
Failure rate    | Never fails vs always fails  | Error handling adequacy
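
The failure-rate row can be tried directly by injecting failures at both extremes. The sketch below is only an illustration: `flaky_operation` and `call_with_retries` are hypothetical names, and the retry policy (a fixed number of attempts, no backoff) is an assumption chosen to keep the example small.

```python
import random
from typing import Callable, Tuple


def flaky_operation(failure_rate: float, rng: random.Random) -> str:
    """Hypothetical operation that fails with the given probability."""
    if rng.random() < failure_rate:
        raise RuntimeError("injected failure")
    return "ok"


def call_with_retries(op: Callable[[], str], max_attempts: int) -> Tuple[str, int]:
    """Call op until it succeeds; return (result, attempts used)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return op(), attempt
        except RuntimeError as exc:
            last_error = exc
    # At the "always fails" extreme every attempt lands here.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    rng = random.Random(0)
    # Extreme 1: never fails. Retries are pure overhead.
    print(call_with_retries(lambda: flaky_operation(0.0, rng), 3))
    # Extreme 2: always fails. Retries multiply load without helping.
    try:
        call_with_retries(lambda: flaky_operation(1.0, rng), 3)
    except RuntimeError as exc:
        print(exc)
```

At a failure rate of 0.0 the retry loop never runs a second time; at 1.0 it triples the call volume and still fails, which is the kind of question the "always fails" extreme is meant to raise.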

Process

  1. Pick dimension - What could vary extremely?
  2. Test minimum - What if 1000x smaller/faster/fewer?
  3. Test maximum - What if 1000x bigger/slower/more?
  4. Note what breaks - Where do limits appear?
  5. Note what survives - What's fundamentally sound?
  6. Design for reality - Use insights to validate architecture
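
The steps above can be sketched as a small harness for the volume dimension. This is a minimal illustration, not a benchmark tool: the two duplicate-check functions and `scale_game` are hypothetical, and the sizes are scaled down from 1000x so the quadratic version still finishes quickly.

```python
import time
from typing import Callable, Dict, List, Sequence


def has_duplicates_quadratic(items: List[int]) -> bool:
    """Hypothetical code under test: compares every pair, O(n^2)."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items: List[int]) -> bool:
    """Alternative using a set, O(n) on average."""
    return len(set(items)) != len(items)


def scale_game(fn: Callable[[List[int]], bool], sizes: Sequence[int]) -> Dict[int, float]:
    """Run fn at each input size and return elapsed seconds per size."""
    timings: Dict[int, float] = {}
    for n in sizes:
        data = list(range(n))  # no duplicates: worst case for both functions
        start = time.perf_counter()
        fn(data)
        timings[n] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    sizes = (1, 100, 2_000)  # minimum, baseline, maximum (assumed values)
    for fn in (has_duplicates_quadratic, has_duplicates_linear):
        print(fn.__name__, scale_game(fn, sizes))
```

Comparing the two rows of timings covers steps 4 and 5: the pairwise version is where a limit appears, while the set-based version is the part that survives.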

Detailed Examples

Example 1: Error Handling

  • Normal scale: "Handle errors when they occur" works fine
  • At 1B scale: Error volume alone can overwhelm logging and take the system down
  • Reveals: Need to make errors impossible (type systems) or expect them (chaos engineering)
  • Action: Design error handling for volume, not just occurrence
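
One way to design for volume is to log only the first few occurrences of each error kind and count the rest. The class below is a hypothetical sketch of that idea; `AggregatingErrorLog` and its `max_per_kind` parameter are illustrative names, not part of any logging library.

```python
from collections import Counter
from typing import Dict, List


class AggregatingErrorLog:
    """Hypothetical helper: emit a few samples per error kind, count everything."""

    def __init__(self, max_per_kind: int = 3) -> None:
        self.max_per_kind = max_per_kind
        self.counts: Counter = Counter()
        self.emitted: List[str] = []  # stands in for real log output

    def record(self, kind: str, detail: str) -> None:
        self.counts[kind] += 1
        # Only the first max_per_kind occurrences produce a log line.
        if self.counts[kind] <= self.max_per_kind:
            self.emitted.append(f"{kind}: {detail}")

    def summary(self) -> Dict[str, int]:
        """Totals per error kind, regardless of how many lines were emitted."""
        return dict(self.counts)


if __name__ == "__main__":
    log = AggregatingErrorLog(max_per_kind=3)
    for i in range(100_000):
        log.record("timeout", f"request {i}")
    print(len(log.emitted), log.summary())
```

Log output stays bounded per error kind while the counts still show how large the problem is.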

Example 2: Synchronous APIs

  • Normal scale: Direct function calls work, with latency under 100 ms
  • At global scale: Network round trips of 200-500 ms each make chains of blocking synchronous calls unusable
  • Reveals: Async/messaging becomes survival requirement, not optimization
  • Action: Design async-first from start
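
The effect can be simulated with `asyncio`. In this sketch `fake_remote_call` is a hypothetical stand-in for a network request, and the 10 ms latency and 20 calls are assumed values picked so it runs quickly; real round trips at the scale described above would be far slower.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple


async def fake_remote_call(i: int, latency_s: float) -> int:
    """Hypothetical remote call: waits for the simulated latency, then replies."""
    await asyncio.sleep(latency_s)
    return i * 2


async def call_sequentially(n: int, latency_s: float) -> List[int]:
    """Each call waits for the previous one: total time grows as n * latency."""
    return [await fake_remote_call(i, latency_s) for i in range(n)]


async def call_concurrently(n: int, latency_s: float) -> List[int]:
    """All calls are in flight at once: total time stays near one latency."""
    results = await asyncio.gather(*(fake_remote_call(i, latency_s) for i in range(n)))
    return list(results)


def timed(make_coro: Callable[[], Awaitable[List[int]]]) -> Tuple[List[int], float]:
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(make_coro())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(lambda: call_sequentially(20, 0.01))
    _, t_con = timed(lambda: call_concurrently(20, 0.01))
    print(f"sequential: {t_seq:.3f}s, concurrent: {t_con:.3f}s")
```

Raising `latency_s` toward the 200-500 ms range widens the gap between the two variants in proportion to the number of calls.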

Example 3: In-Memory State

  • Normal duration: Works for hours/days in development
  • At years: Memory grows without bound until the process crashes; the longer the uptime (weeks, months, years), the more certain the failure
  • Reveals: Need persistence or periodic cleanup, can't rely on memory forever
  • Action: Design for stateless or externalized state
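
Where some in-memory state has to stay, bounding it is one form of the periodic cleanup mentioned above. The sketch below is a hypothetical least-recently-used cache built on `collections.OrderedDict`; the class name and the `max_entries` limit are assumptions for illustration.

```python
from collections import OrderedDict
from typing import Any, Hashable


class BoundedCache:
    """Hypothetical cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)  # refresh recency on overwrite
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the oldest entry

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def __len__(self) -> int:
        return len(self._data)


if __name__ == "__main__":
    cache = BoundedCache(max_entries=1_000)
    for i in range(100_000):  # stands in for years of uptime
        cache.put(i, f"value-{i}")
    print(len(cache))
```

However many keys pass through, the size never exceeds `max_entries`, so run length stops being a factor in memory use.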

Example 4: Hundred vs Million Users

  • Normal scale: Session in memory works for 100 users
  • At 1M scale: Memory exhausted, server crashes
  • Reveals: Need distributed session store (Redis, database)
  • Action: Design for horizontal scaling from start
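
A rough estimate plus a storage interface makes this example concrete. Everything here is an assumption for illustration: the 10 KiB per-session size, the `SessionStore` protocol, and `handle_request` are hypothetical. A Redis- or database-backed class could implement the same two methods; none is shown because it would need a running server.

```python
from typing import Dict, Optional, Protocol


def session_memory_gib(users: int, bytes_per_session: int) -> float:
    """Back-of-envelope memory needed to hold one session per user, in GiB."""
    return users * bytes_per_session / 2**30


class SessionStore(Protocol):
    """Hypothetical interface the request handler depends on."""

    def load(self, session_id: str) -> Optional[dict]: ...
    def save(self, session_id: str, data: dict) -> None: ...


class InMemorySessionStore:
    """Fine for a small user count; the estimate above shows where it stops."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def load(self, session_id: str) -> Optional[dict]:
        return self._sessions.get(session_id)

    def save(self, session_id: str, data: dict) -> None:
        self._sessions[session_id] = data


def handle_request(store: SessionStore, session_id: str) -> int:
    """The handler keeps no state of its own, so the store can be swapped."""
    session = store.load(session_id) or {"visits": 0}
    session["visits"] += 1
    store.save(session_id, session)
    return session["visits"]


if __name__ == "__main__":
    for users in (100, 1_000_000):
        print(users, f"{session_memory_gib(users, 10 * 1024):.4f} GiB")
```

Under the assumed 10 KiB per session, 100 users need about 1 MiB while 1M users need roughly 9.5 GiB on a single server. Because `handle_request` only sees the interface, moving to an external store does not change the handler.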

Both Directions Matter

Test smaller too:

  • What if only 1 user? Does complexity make sense?
  • What if only 10 items? Is optimization premature?
  • What if instant response? What becomes unnecessary?

Often reveals over-engineering or premature optimization.

Red Flags

You need scale game when:

  • "It works in dev" (but will it work in production?)
  • No idea where limits are
  • "Should scale fine" (without evidence)
  • Surprised by production behavior
  • Architecture feels arbitrary

Success Metrics

After scale game, you should know:

  • Where the system breaks (the limits, stated as concretely as you can)
  • What survives (fundamentally sound parts)
  • What needs redesign (scale-dependent)
  • Production readiness (validated architecture)

Remember

  • Extremes reveal fundamentals hidden at normal scales
  • What works at one scale often fails at another
  • Test BOTH directions (bigger AND smaller)
  • Use insights to validate architecture early
  • Don't guess - test at extremes