# Engineering Review — Q1 2026

*From monolith proxy to edge-native routing*
## Why the current gateway can't scale

### Current State
## How the new system works
```mermaid
graph LR
    Client["Client"] --> Edge["Edge Worker"]
    Edge --> Auth["Auth Cache"]
    Edge --> RL["Rate Limiter<br/>Durable Object"]
    Edge --> Router["Route Resolver"]
    Router --> API["API Service"]
    Router --> Static["Static Assets"]
    API --> DB["Database"]

    classDef primary fill:#d4a73a22,stroke:#d4a73a,stroke-width:2px
    classDef secondary fill:#60a5fa22,stroke:#60a5fa,stroke-width:2px
    classDef storage fill:#4ade8022,stroke:#4ade80,stroke-width:2px
    class Client,Edge primary
    class Auth,RL,Router secondary
    class API,Static,DB storage
```
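The Route Resolver in the diagram can be sketched as longest-prefix matching over a static route table. The table entries, prefixes, and upstream names below are illustrative assumptions, not the project's actual configuration:

```typescript
// Hypothetical route table: path prefix → upstream service name.
type RouteTable = ReadonlyArray<{ prefix: string; upstream: string }>;

const ROUTES: RouteTable = [
  { prefix: "/api/", upstream: "api-service" },      // assumed name
  { prefix: "/assets/", upstream: "static-assets" }, // assumed name
];

// Longest-prefix match, so a more specific prefix wins over a
// hypothetical "/" catch-all.
function resolveUpstream(path: string, routes: RouteTable): string | null {
  let best: { prefix: string; upstream: string } | null = null;
  for (const r of routes) {
    if (path.startsWith(r.prefix) && (!best || r.prefix.length > best.prefix.length)) {
      best = r;
    }
  }
  return best ? best.upstream : null;
}
```

Keeping the table static and evaluated per request avoids any shared mutable state at the edge, which fits the stateless worker model.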
| Phase | Scope | Timeline | Risk |
|---|---|---|---|
| 1. Shadow mode | Mirror traffic to edge, compare responses | Week 1–2 | Low |
| 2. Canary rollout | 5% traffic to edge, monitor errors | Week 3 | Low |
| 3. Gradual shift | 25% → 50% → 75% traffic | Week 4–5 | Medium |
| 4. Full cutover | 100% traffic, decommission old proxy | Week 6 | Medium |
| 5. Cleanup | Remove feature flags, archive old code | Week 7 | Low |
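Phase 1's response comparison could look like the sketch below: hash and diff the primary and shadow responses, ignoring headers that are expected to differ per request. The `ShadowSample` shape and the ignored-header list are assumptions for illustration:

```typescript
// Assumed shape of a captured response; bodyHash computed elsewhere.
interface ShadowSample {
  status: number;
  bodyHash: string;
  headers: Record<string, string>;
}

// Headers that legitimately differ between the two paths; comparing them
// would make every request look like a mismatch.
const IGNORED_HEADERS = new Set(["date", "cf-ray", "x-request-id"]);

function diffResponses(primary: ShadowSample, shadow: ShadowSample): string[] {
  const diffs: string[] = [];
  if (primary.status !== shadow.status) {
    diffs.push(`status: ${primary.status} != ${shadow.status}`);
  }
  if (primary.bodyHash !== shadow.bodyHash) {
    diffs.push("body mismatch");
  }
  for (const [k, v] of Object.entries(primary.headers)) {
    if (IGNORED_HEADERS.has(k.toLowerCase())) continue;
    if (shadow.headers[k] !== v) diffs.push(`header ${k}`);
  }
  return diffs;
}
```

An empty diff list means the edge response matched; anything else gets logged for review before the canary phase begins.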
```typescript
export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    // Auth check with edge-cached keys
    const identity = await verifyAuth(req, env);

    // Distributed rate limiting via a per-identity Durable Object;
    // get() takes a DurableObjectId, so derive one from the identity.
    const limiter = env.RATE_LIMITER.get(env.RATE_LIMITER.idFromName(identity.id));
    // check() resolves true when the caller is over its limit
    if (await limiter.check()) return tooMany();

    // Route to upstream
    return route(req, env.SERVICES);
  },
};
```
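Inside the rate-limiter Durable Object, `check()` could be backed by a token bucket like the sketch below. The capacity and refill numbers are illustrative assumptions, and the class is shown as plain state rather than a full Durable Object so the logic stands alone:

```typescript
// Minimal token-bucket sketch of what the rate limiter's check() might do.
// Capacity and refill rate are illustrative, not production values.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity = 10,      // max burst size (assumed)
    private refillPerSec = 5,   // sustained requests/sec (assumed)
    now = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true when the caller is OVER the limit, matching the worker's
  // `if (await limiter.check()) return tooMany()` convention.
  check(now = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens < 1) return true; // over limit
    this.tokens -= 1;
    return false;                     // request allowed
  }
}
```

Because each identity maps to its own Durable Object, the bucket's state is single-writer by construction and needs no extra locking.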
> The fastest request is the one that never leaves the edge.
>
> — Edge Computing Principle
## Next Steps

Full cutover targeted for the end of Q1.