Engineering Review — Q1 2026

API Gateway Redesign

From monolith proxy to edge-native routing

01

The Problem

Why the current gateway can't scale

Current State

Single Point of Failure

  • All traffic routes through one Node.js process
  • Rate limiting is per-instance, not distributed
  • Auth validation adds 40ms per request
  • No circuit breaking — cascade failures hit everything

Before

Monolith Proxy

  • Express.js middleware chain
  • In-memory rate limit counters
  • Synchronous JWT validation
  • Manual upstream health checks

After

Edge-Native

  • Cloudflare Workers at the edge
  • Durable Objects for distributed state
  • Async JWT with key caching
  • Automatic circuit breakers
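The "automatic circuit breakers" item above can be illustrated with a minimal sketch. The class name, the consecutive-failure threshold, and the cooldown are assumptions made for illustration; the deck does not specify how the breaker is configured or where its state lives.

```typescript
// Hypothetical circuit breaker; names and thresholds are illustrative assumptions.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number; // consecutive failures before opening
  private readonly cooldownMs: number;       // time to stay open before probing

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // Returns true if a request may be sent upstream at time `now` (ms).
  allow(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: let probe traffic through until an outcome is recorded.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // Record the outcome of an upstream call.
  record(success: boolean, now: number): void {
    if (success) {
      this.failures = 0;
      this.state = "closed";
      return;
    }
    this.failures += 1;
    // A failed probe, or too many consecutive failures, opens the breaker.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  current(): BreakerState {
    return this.state;
  }
}
```

While the breaker is open, the gateway can fail fast instead of queueing requests against an unhealthy upstream, which is what contains the cascade failures listed under Current State.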
02

Architecture

How the new system works

Request Flow

graph LR
  Client["Client"] --> Edge["Edge Worker"]
  Edge --> Auth["Auth Cache"]
  Edge --> RL["Rate Limiter<br/>Durable Object"]
  Edge --> Router["Route Resolver"]
  Router --> API["API Service"]
  Router --> Static["Static Assets"]
  API --> DB["Database"]
  classDef primary fill:#d4a73a22,stroke:#d4a73a,stroke-width:2px
  classDef secondary fill:#60a5fa22,stroke:#60a5fa,stroke-width:2px
  classDef storage fill:#4ade8022,stroke:#4ade80,stroke-width:2px
  class Client,Edge primary
  class Auth,RL,Router secondary
  class API,Static,DB storage
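The Route Resolver step in the diagram could be sketched as a longest-prefix match. The route table, the path prefixes, and the `resolveRoute` name are hypothetical; the deck only states that the resolver chooses between the API service and static assets.

```typescript
// Hypothetical route table; the prefixes and upstream names are assumptions.
type Upstream = "api" | "static";

interface RouteRule {
  prefix: string;
  upstream: Upstream;
}

const ROUTES: RouteRule[] = [
  { prefix: "/api/", upstream: "api" },
  { prefix: "/", upstream: "static" },
];

// Pick the rule with the longest matching prefix, or undefined if none match.
function resolveRoute(
  pathname: string,
  rules: RouteRule[] = ROUTES,
): RouteRule | undefined {
  let best: RouteRule | undefined;
  for (const rule of rules) {
    const matches = pathname.startsWith(rule.prefix);
    if (matches && (!best || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best;
}
```

Longest-prefix matching keeps the table order-independent, so a catch-all `/` rule cannot shadow a more specific one.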

Performance Impact

  • P99 Latency: 12ms (↓ from 142ms)
  • Uptime: 99.97% (↑ from 99.2%)
  • Edge Locations: 340 (global coverage)
  • Cost per 10K Requests: $0.02 (↓ 68% cost reduction)

Migration Phases

Phase             | Scope                                     | Timeline | Risk
1. Shadow mode    | Mirror traffic to edge, compare responses | Week 1–2 | Low
2. Canary rollout | 5% traffic to edge, monitor errors        | Week 3   | Low
3. Gradual shift  | 25% → 50% → 75% traffic                   | Week 4–5 | Medium
4. Full cutover   | 100% traffic, decommission old proxy      | Week 6   | Medium
5. Cleanup        | Remove feature flags, archive old code    | Week 7   | Low
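One way to implement the percentage steps in phases 2 to 4 is deterministic bucketing on a caller identifier, sketched below. The hash choice (an FNV-1a-style hash over UTF-16 code units) and the `routeToEdge` name are assumptions; the deck does not say how traffic is split.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units: small, stable, non-cryptographic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash;
}

// True if this caller falls inside the first `percent` of 100 buckets.
// `callerId` is a hypothetical stable identifier (for example, an API key ID).
function routeToEdge(callerId: string, percent: number): boolean {
  return fnv1a(callerId) % 100 < percent;
}
```

Because each caller always lands in the same bucket, a caller moved to the edge at 5% stays on the edge at 25%, 50%, and 75%, which keeps error comparisons between the two paths stable across phases.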

Edge Worker Entry Point

gateway.ts
export default {
  async fetch(req: Request, env: Env) {
    // Auth check with edge-cached keys
    const identity = await verifyAuth(req, env);
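    // Distributed rate limiting: one Durable Object instance per caller identity
    const limit = env.RATE_LIMITER.get(env.RATE_LIMITER.idFromName(identity.id));
    // check() is assumed to resolve to true when the caller is over its limit
    if (await limit.check()) return tooMany();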
    // Route to upstream
    return route(req, env.SERVICES);
  }
};
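The counting logic behind `check()` is not shown in the deck. A minimal sketch, assuming a fixed-window policy and a convention where `true` means "over the limit", might look like the following; the class name, limit, and window size are all hypothetical. In a real deployment this logic would sit inside the Durable Object class, where a single instance per identity handles every request for that caller.

```typescript
// Hypothetical fixed-window counter; limits and names are illustrative assumptions.
class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;
  private readonly limit: number;    // max requests allowed per window
  private readonly windowMs: number; // window length in milliseconds

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Counts one request at time `now` (ms).
  // Returns true when the caller has exceeded its limit for the current window.
  check(now: number): boolean {
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now; // start a new window
      this.count = 0;
    }
    this.count += 1;
    return this.count > this.limit;
  }
}
```

Keeping one counter per identity in one place is what distinguishes this from the in-memory, per-instance counters of the old proxy.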
The fastest request is the one that never leaves the edge.
— Edge Computing Principle

Next Steps

Ship Shadow Mode This Week

Full cutover targeted for end of Q1