# Backend DevOps Practices

CI/CD pipelines, containerization, deployment strategies, and monitoring (2025).

## Deployment Strategies

### Blue-Green Deployment

**Concept:** Two identical environments (Blue = current, Green = new)

```
Production Traffic → Blue (v1.0)
                     Green (v2.0) ← Deploy & Test

Switch:
Production Traffic → Green (v2.0)
                     Blue (v1.0) ← Instant rollback available
```

**Pros:**
- Zero downtime
- Instant rollback
- Full environment testing before switch

**Cons:**
- Requires double infrastructure
- Database migrations are complex (the schema must work with both versions during the switch)

### Canary Deployment

**Concept:** Gradual rollout (1% → 5% → 25% → 100%)

```bash
# Kubernetes canary-style rollout: pausing a rolling update leaves only some pods on v2.
# Exact traffic percentages need a separate canary Deployment, a service mesh,
# or a progressive delivery controller.
kubectl set image deployment/api-deployment api=myapp:v2
kubectl rollout pause deployment/api-deployment  # Pause at initial replicas

# Monitor metrics, then continue
kubectl rollout resume deployment/api-deployment

# Or abort and return to the previous version
kubectl rollout undo deployment/api-deployment
```

**Pros:**
- Risk mitigation
- Early issue detection
- Real user feedback

**Cons:**
- Requires monitoring
- Longer deployment time

### Feature Flags (Progressive Delivery)

**Impact:** Can sharply reduce deployment failures when combined with canary releases (a figure of 90% is cited, but without a source)

```typescript
import * as LaunchDarkly from 'launchdarkly-node-server-sdk';

const client = LaunchDarkly.init(process.env.LD_SDK_KEY!);

async function checkoutHandler(req, res) {
  // Wait until the SDK has received its initial flag data
  await client.waitForInitialization();

  // Evaluation context for the current user; `req.user.id` is an assumed field
  const user = { key: req.user.id };

  // Check feature flag (falls back to `false` if the flag cannot be evaluated)
  const showNewCheckout = await client.variation('new-checkout', user, false);
  if (showNewCheckout) {
    return newCheckoutFlow(req, res);
  }
  return oldCheckoutFlow(req, res);
}
```

**Use Cases:**
- Gradual feature rollout
- A/B testing
- Kill switch for problematic features
- Decouple deployment from release

## Containerization with Docker

### Multi-Stage Builds (Optimize Image Size)

```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Remove devDependencies so only runtime packages are copied below
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine
WORKDIR /app

# Copy only necessary files
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./

# Security: Run as non-root
RUN addgroup -g 1001 -S nodejs && \
    adduser -S -u 1001 -G nodejs nodejs
USER nodejs

EXPOSE 3000
CMD ["node", "dist/main.js"]
```

**Benefits:**
- Smaller image size (often a 50-90% reduction, depending on base image and dependencies)
- Faster deployments
- Reduced attack surface

### Docker Compose (Local Development)

```yaml
# The top-level `version` key is obsolete in Compose v2 and is omitted here
services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis

  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myapp
    volumes:
      - postgres-data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  postgres-data:
```

## Kubernetes Orchestration

### Deployment Manifest

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myregistry/api:v1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          # Probe paths match the endpoints defined under Health Checks
          livenessProbe:
            httpGet:
              path: /health/liveness
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```

### Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

## CI/CD Pipelines

### GitHub Actions (Modern, Integrated)

```yaml
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run tests
        run: npm run test:ci

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Snyk scan
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      # `docker scan` has been removed from the Docker CLI; Trivy is used here as one alternative
      - name: Container scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}

  deploy:
    needs: [test, security]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write  # Needed to push to ghcr.io with GITHUB_TOKEN
    steps:
      - uses: actions/checkout@v4

      - name: Build and push Docker image
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .
          docker push ghcr.io/${{ github.repository }}:${{ github.sha }}

      # Assumes kubectl on the runner is already configured with cluster credentials
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/api-deployment api=ghcr.io/${{ github.repository }}:${{ github.sha }}
          kubectl rollout status deployment/api-deployment
```

## Monitoring & Observability

### Three Pillars of Observability

**1.
Metrics (Prometheus + Grafana)**

```typescript
import express from 'express';
import { Counter, Histogram, register } from 'prom-client';

const app = express();

// Request counter
const httpRequestTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'route', 'status'],
});

// Response time histogram
const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration',
  labelNames: ['method', 'route'],
  buckets: [0.1, 0.5, 1, 2, 5],
});

// Middleware to track metrics
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    // Use the route pattern, not the raw URL, to keep label cardinality low
    const route = req.route?.path ?? 'unmatched';
    httpRequestTotal.inc({ method: req.method, route, status: res.statusCode });
    httpRequestDuration.observe({ method: req.method, route }, duration);
  });
  next();
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
```

**2. Logs (ELK Stack - Elasticsearch, Logstash, Kibana)**

```typescript
import winston from 'winston';
import { ElasticsearchTransport } from 'winston-elasticsearch';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.Console(),
    new ElasticsearchTransport({
      level: 'info',
      clientOpts: { node: 'http://localhost:9200' },
      index: 'logs',
    }),
  ],
});

// Structured logging (called from inside a request handler, where `user` and `req` exist)
logger.info('User created', {
  userId: user.id,
  email: user.email,
  ipAddress: req.ip,
  userAgent: req.headers['user-agent'],
});
```

**3. Traces (Jaeger/OpenTelemetry)**

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

// The dedicated Jaeger exporter package is deprecated; Jaeger accepts OTLP directly,
// so this sends OTLP over HTTP to a local collector or Jaeger instance.
const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces',
  }),
  serviceName: 'api-service',
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

// With auto-instrumentations enabled, traces are captured for HTTP requests, database queries, etc.
```

### Health Checks

```typescript
// Liveness probe - Is the app running?
app.get('/health/liveness', (req, res) => {
  res.status(200).json({ status: 'ok', timestamp: Date.now() });
});

// Readiness probe - Is the app ready to serve traffic?
app.get('/health/readiness', async (req, res) => {
  // Run the dependency checks in parallel
  const [database, redis, externalAPI] = await Promise.all([
    checkDatabase(),
    checkRedis(),
    checkExternalAPI(),
  ]);
  const checks = { database, redis, externalAPI };

  const isReady = Object.values(checks).every(Boolean);
  res.status(isReady ? 200 : 503).json({
    status: isReady ? 'ready' : 'not ready',
    checks,
  });
});

// checkRedis() and checkExternalAPI() follow the same pattern (not shown)
async function checkDatabase() {
  try {
    await db.query('SELECT 1');
    return true;
  } catch {
    return false;
  }
}
```

## Secrets Management

### HashiCorp Vault

```bash
# Store secret
vault kv put secret/myapp/db password=super-secret

# Retrieve secret
vault kv get -field=password secret/myapp/db
```

### Kubernetes Secrets

```yaml
# Secret values are stored base64-encoded, not encrypted, unless encryption at rest is enabled
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:
  url: postgresql://user:pass@host:5432/db
---
# Reference in deployment (fragment of the container spec)
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: url
```

## Infrastructure as Code (Terraform)

```hcl
# main.tf
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "main" {
  identifier        = "myapp-db"
  engine            = "postgres"
  engine_version    = "15.3"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  # "admin" is a reserved word for RDS PostgreSQL and is rejected as a master username
  username = "myapp_admin"
  password = var.db_password

  backup_retention_period = 7
  skip_final_snapshot     = false
  # Required when skip_final_snapshot is false
  final_snapshot_identifier = "myapp-db-final"
}

resource "aws_elasticache_cluster" "redis" {
  cluster_id           = "myapp-redis"
  engine               = "redis"
  node_type            = "cache.t3.micro"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
}
```

## DevOps Checklist

- [ ] CI/CD pipeline configured (GitHub Actions/GitLab CI/Jenkins)
- [ ] Docker multi-stage builds implemented
- [ ] Kubernetes deployment manifests created
- [ ] Blue-green or canary deployment strategy
- [ ] Feature flags configured (LaunchDarkly/Unleash)
- [ ] Health checks (liveness + readiness probes)
- [ ] Monitoring: Prometheus + Grafana
- [ ] Logging: ELK Stack or similar
- [ ] Distributed tracing: Jaeger/OpenTelemetry
- [ ] Secrets management (Vault/AWS Secrets Manager)
- [ ] Infrastructure as Code (Terraform/CloudFormation)
- [ ] Autoscaling configured
- [ ] Backup and disaster recovery plan

## Resources

- **Kubernetes:** https://kubernetes.io/docs/
- **Docker:** https://docs.docker.com/
- **Prometheus:** https://prometheus.io/docs/
- **OpenTelemetry:** https://opentelemetry.io/docs/
- **Terraform:** https://www.terraform.io/docs/
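
## Appendix: Percentage Rollout Sketch

The gradual rollout described under Canary Deployment (1% → 5% → 25% → 100%) and the "gradual feature rollout" use case for feature flags both depend on assigning each user to a stable slice of traffic. The sketch below shows one common way to do that, hash-based bucketing. It is an illustration under stated assumptions, not how LaunchDarkly or Kubernetes implement rollouts; `ROLLOUT_STAGES`, `bucketFor`, and `isInRollout` are hypothetical names introduced here.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical stages mirroring the 1% → 5% → 25% → 100% progression.
const ROLLOUT_STAGES = [1, 5, 25, 100] as const;

// Map a user to a stable bucket in [0, 100).
// Hashing the flag key together with the user ID gives each flag
// an independent assignment of users to buckets.
function bucketFor(userId: string, flagKey: string): number {
  const digest = createHash('sha256').update(`${flagKey}:${userId}`).digest();
  // Interpret the first 4 bytes as an unsigned 32-bit integer.
  // The modulo introduces a tiny bias, which is negligible for rollout purposes.
  return digest.readUInt32BE(0) % 100;
}

// A user is in the rollout when their bucket falls below the current percentage.
// Because the bucket never changes, raising the percentage only ever adds users.
function isInRollout(userId: string, flagKey: string, percent: number): boolean {
  return bucketFor(userId, flagKey) < percent;
}

// Show roughly how many of 10,000 synthetic users each stage would include.
for (const percent of ROLLOUT_STAGES) {
  let included = 0;
  for (let i = 0; i < 10_000; i++) {
    if (isInRollout(`user-${i}`, 'new-checkout', percent)) included++;
  }
  console.log(`${percent}% stage: ${included} of 10000 users`);
}
```

Since a user's bucket is fixed, anyone who saw the new version at 5% still sees it at 25%, which avoids users flipping between versions as the rollout widens. Rolling back is a matter of lowering the percentage.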