init ingestion

2026-05-24 22:59:24 +07:00
commit 4e8c11d545
80 changed files with 5639 additions and 0 deletions

83
.gitignore vendored Normal file

@@ -0,0 +1,83 @@
### macOS
# Finder metadata
.DS_Store
# Thumbnails
._*
# Custom folder icons
Icon
# Volume root files
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
### Go
# Binaries for programs and plugins
*.exe
*.exe~
*.dll
*.so
*.dylib
# Test binary, built with `go test -c`
*.test
# Output of the go coverage tool
*.out
# Go workspace file
go.work.sum
# env file
.env
### Node
# Dependencies
node_modules/
# Logs
*.log
# Runtime data
*.pid
*.pid.lock
# Coverage
coverage/
*.lcov
.nyc_output
# Build output
dist/
build/Release
# TypeScript cache
*.tsbuildinfo
# Framework build output and caches
.cache
.parcel-cache
.next
out/
.nuxt
# dotenv environment variable files
.env
.env.local
.env.*.local
# npm cache directory
.npm
*.tgz
# yarn v2
.yarn/cache
.yarn/unplugged
.yarn/install-state.gz


@@ -0,0 +1,415 @@
# CLAUDE.md — CDP Analytics Service
> You are a senior software engineer building the **Analytics & Data Layer** for a self-hosted CDP platform.
> This service focuses on **query, explore, and activate** data already ingested into ClickHouse.
>
> **Scope boundary**: Read-side only. Never write raw events. Ingestion is handled by `cdp-ingestion`.
---
## What This Service Does
Exposes ingested event data via Query API for exploration and analysis. Computes Traits and Audience
Segments from event history via background workers. Activates segments to external tools via Reverse ETL
and webhooks.
---
## Repository Layout
```
cdp-analytics/
├── api/        # Go — Query API, Profile API (port 4000)
├── workers/    # Go — Background jobs: Computed Traits, Segment refresh
├── console/    # React + Vite + shadcn/ui + Tailwind — Analytics UI
└── infra/
    ├── migrations/   # PostgreSQL migrations (golang-migrate)
    └── clickhouse/   # ClickHouse query templates (.sql files)
```
---
## Tech Stack
### Go Services (api, workers)
| Concern | Library | Notes |
|---------|---------|-------|
| HTTP router | `chi` | Lightweight, stdlib-compatible middleware |
| Logger | `zap` | Structured logging, fastest |
| PostgreSQL | `pgx/v5` | Native driver, no database/sql wrapper |
| ClickHouse | `clickhouse-go/v2` | Official driver, native protocol, good batch support |
| Redis | `rueidis` | Modern client, faster than go-redis |
| Job queue | `riverqueue/river` | Postgres-backed, pgx/v5 native, built-in scheduler + retry |
| Config | `caarlos0/env` | Parse env vars into structs, zero deps |
| Validation | `go-playground/validator/v10` | Struct tags validation |
| Migration | `golang-migrate` + pgx driver | CLI only — never auto-migrate on startup |
| Test assertion | `testify` | assert + require + mock |
| Integration test | `testcontainers-go` | Real PG / Redis / ClickHouse in tests |
### React Console (console/)
| Concern | Library |
|---------|---------|
| Build | Vite |
| UI components | shadcn/ui + Tailwind |
| Routing | React Router v6 |
| Server state | TanStack Query |
| Client state | Zustand |
| Forms | react-hook-form + zod |
| Charts | Recharts |
| Icons | lucide-react |
> **No new technology** without discussion. All additions must justify why existing stack cannot handle it.
---
## Go Project Structure
### api/
```
api/
├── cmd/
│   └── server/
│       └── main.go     # wire everything, start server
└── internal/
    ├── handler/        # HTTP handlers — parse request, call service, write response
    ├── service/        # business logic — no HTTP, no DB concerns
    ├── repo/           # DB queries — PostgreSQL via pgx, ClickHouse via clickhouse-go
    ├── middleware/     # auth, request ID, logging
    └── config/         # env parsing via caarlos0/env
```
### workers/
```
workers/
├── cmd/
│   └── worker/
│       └── main.go     # register jobs, start river worker
└── internal/
    ├── job/            # job definitions (ComputeTraitsJob, RefreshSegmentJob, ReverseETLJob)
    ├── handler/        # job handlers — business logic per job type
    ├── repo/           # DB queries shared across job handlers
    └── config/
```
Rules:
- In api/, `handler` depends on `service`, and `service` depends on `repo`. In workers/, `handler` depends on `repo`. Never the reverse.
- `handler` never touches DB directly in api/.
- `service` never imports `chi` or any HTTP package.
- `repo` returns domain types, never raw `pgx.Rows` or `driver.Rows`.
- ClickHouse queries live as `.sql` files in `infra/clickhouse/` — no inline SQL strings for complex queries.
---
## Error Handling
Same `AppError` pattern as ingestion. Never return raw `pgx` or `clickhouse-go` errors to handlers.
```go
// internal/apperr/apperr.go
type AppError struct {
Code int // HTTP status code to return
Message string // user-facing message (safe to expose)
Field string // optional: which field caused the error
Err error // original error for logging (not exposed to user)
}
func (e *AppError) Error() string { return e.Message }
func (e *AppError) Unwrap() error { return e.Err }
// Constructors
func BadRequest(msg, field string, err error) *AppError
func NotFound(msg string) *AppError
func Forbidden(msg string) *AppError
func Internal(err error) *AppError
```
Handler pattern — one place handles all errors:
```go
func writeError(w http.ResponseWriter, err error) {
var appErr *apperr.AppError
if errors.As(err, &appErr) {
render.JSON(w, appErr.Code, ErrorResponse{Error: appErr.Message, Field: appErr.Field})
return
}
render.JSON(w, 500, ErrorResponse{Error: "internal server error"})
}
```
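
The constructors above are declared without bodies. A minimal sketch of how they might be filled in follows; the status-code mapping and the generic message used by `Internal` are assumptions, and `statusOf` is a hypothetical helper that only mirrors the `errors.As` branch of `writeError` so the example can run without an HTTP server.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// AppError repeats the struct from the section above so this file compiles on its own.
type AppError struct {
	Code    int
	Message string
	Field   string
	Err     error
}

func (e *AppError) Error() string { return e.Message }
func (e *AppError) Unwrap() error { return e.Err }

// BadRequest maps to 400 and keeps the offending field name for the response.
func BadRequest(msg, field string, err error) *AppError {
	return &AppError{Code: http.StatusBadRequest, Message: msg, Field: field, Err: err}
}

// NotFound maps to 404.
func NotFound(msg string) *AppError {
	return &AppError{Code: http.StatusNotFound, Message: msg}
}

// Internal maps to 500 and hides the cause behind a generic message (assumed wording).
func Internal(err error) *AppError {
	return &AppError{Code: http.StatusInternalServerError, Message: "internal server error", Err: err}
}

// statusOf is a hypothetical helper: it picks the status code the same way
// writeError does, falling back to 500 for errors that are not AppError.
func statusOf(err error) int {
	var appErr *AppError
	if errors.As(err, &appErr) {
		return appErr.Code
	}
	return http.StatusInternalServerError
}

func main() {
	// An AppError is still found after being wrapped with %w.
	wrapped := fmt.Errorf("load profile: %w", NotFound("profile not found"))
	fmt.Println(statusOf(wrapped), wrapped)

	// The original driver error stays reachable for logging via errors.Is.
	cause := errors.New("driver: connection reset")
	fmt.Println(statusOf(Internal(cause)), errors.Is(Internal(cause), cause))
}
```

Because `AppError` implements `Unwrap`, the repo layer can wrap a driver error once and the handler never needs to import `pgx` or `clickhouse-go` to decide on a status code.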
---
## ClickHouse Query Pattern
Use raw SQL only. No query builder — ClickHouse SQL has its own syntax that builders handle poorly.
```
infra/clickhouse/
├── event_explorer.sql
├── funnel_analysis.sql
├── retention_cohort.sql
└── session_analysis.sql
```
Load templates at startup, inject parameters safely:
```go
// Never fmt.Sprintf into SQL — use named parameters
query, err := templates.Load("funnel_analysis.sql")
rows, err := chConn.Query(ctx, query, clickhouse.Named("workspace_id", id), ...)
```
Rules:
- All ClickHouse queries must have a corresponding `.sql` file in `infra/clickhouse/`
- No multi-line SQL strings inline in Go code
- Every ClickHouse schema change must have a DDL file in `infra/clickhouse/`
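
The `templates.Load` call above is not defined in this document. One possible shape, sketched under assumptions: a `Templates` type that reads every `.sql` file from an `fs.FS` once at startup and serves them by file name. The type, the function names, and the `@workspace_id` placeholder in the demo query are all hypothetical; an in-memory `fstest.MapFS` stands in for `infra/clickhouse/` so the example is runnable.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// Templates holds SQL text keyed by file name, loaded once at startup.
type Templates struct {
	queries map[string]string
}

// LoadTemplates reads every .sql file at the root of fsys and ignores other files.
func LoadTemplates(fsys fs.FS) (*Templates, error) {
	entries, err := fs.ReadDir(fsys, ".")
	if err != nil {
		return nil, fmt.Errorf("read template dir: %w", err)
	}
	t := &Templates{queries: make(map[string]string)}
	for _, e := range entries {
		if e.IsDir() || !strings.HasSuffix(e.Name(), ".sql") {
			continue
		}
		b, err := fs.ReadFile(fsys, e.Name())
		if err != nil {
			return nil, fmt.Errorf("read %s: %w", e.Name(), err)
		}
		t.queries[e.Name()] = string(b)
	}
	return t, nil
}

// Load returns the SQL text for name, or an error if no such template was loaded.
func (t *Templates) Load(name string) (string, error) {
	q, ok := t.queries[name]
	if !ok {
		return "", fmt.Errorf("unknown query template %q", name)
	}
	return q, nil
}

// demoFS stands in for the real directory; in the service this could be
// os.DirFS("infra/clickhouse") or an embed.FS.
func demoFS() fs.FS {
	return fstest.MapFS{
		"funnel_analysis.sql": {Data: []byte("SELECT count() FROM events WHERE workspace_id = @workspace_id")},
		"README.md":           {Data: []byte("not a query")},
	}
}

func main() {
	tpl, err := LoadTemplates(demoFS())
	if err != nil {
		panic(err)
	}
	q, err := tpl.Load("funnel_analysis.sql")
	if err != nil {
		panic(err)
	}
	fmt.Println(q)

	_, err = tpl.Load("missing.sql")
	fmt.Println(err)
}
```

Loading everything up front means a missing or unreadable template fails at startup rather than on the first request that needs it.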
---
## Job Queue (river)
Background workers use `riverqueue/river` backed by PostgreSQL.
```go
// Define a job
type ComputeTraitsArgs struct {
WorkspaceID string `json:"workspace_id"`
TraitID string `json:"trait_id"`
}
func (ComputeTraitsArgs) Kind() string { return "compute_traits" }
// Register handler
river.AddWorker(workers, &ComputeTraitsWorker{repo: repo})
// Enqueue
client.Insert(ctx, ComputeTraitsArgs{WorkspaceID: "ws_123", TraitID: "t_456"}, nil)
```
Scheduled jobs (periodic):
```go
// Hourly trait recompute; add a second entry like this one for the hourly segment refresh.
// Register the result in river.Config.PeriodicJobs.
river.NewPeriodicJob(
	river.PeriodicInterval(time.Hour),
	func() (river.JobArgs, *river.InsertOpts) {
		return ComputeTraitsArgs{}, nil
	},
	nil, // *river.PeriodicJobOpts, e.g. RunOnStart
)
```
Rules:
- Workers must be idempotent — river may retry on failure
- Use `river`'s built-in retry with exponential backoff, do not implement custom retry
- Log job start, job end, duration, and error with full context (job_id, args)
---
## Cache Strategy (Redis)
Semantic key structure — allows per-workspace invalidation:
```
cache:query:events:{workspace_id}:{hash(params)} TTL 60s
cache:query:funnel:{workspace_id}:{hash(params)} TTL 60s
cache:query:retention:{workspace_id}:{hash(params)} TTL 60s
cache:dashboard:{workspace_id} TTL 60s
cache:profile:{workspace_id}:{profile_id} TTL 30s
```
Rules:
- Default TTL: 60s for aggregate queries, 30s for profile lookups
- TTL is configurable per query type via env vars
- On cache miss: query ClickHouse, write result to Redis, return result
- Never cache Custom SQL results — each query is arbitrary
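
The key layout above leaves `hash(params)` unspecified. A minimal sketch of one way to build such a key, assuming SHA-256 over the JSON encoding of the parameters, truncated to 16 hex characters; the function name, the hash choice, and the truncation length are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// QueryCacheKey builds cache:query:{type}:{workspace_id}:{hash(params)}.
// encoding/json writes map keys in sorted order and struct fields in
// declaration order, so equal parameters always produce equal bytes.
func QueryCacheKey(queryType, workspaceID string, params any) (string, error) {
	raw, err := json.Marshal(params)
	if err != nil {
		return "", fmt.Errorf("marshal params: %w", err)
	}
	sum := sha256.Sum256(raw)
	// First 8 bytes (16 hex chars) keep the key short; the length is an assumption.
	return fmt.Sprintf("cache:query:%s:%s:%s", queryType, workspaceID, hex.EncodeToString(sum[:8])), nil
}

func main() {
	key, err := QueryCacheKey("funnel", "ws_123", map[string]any{
		"steps": []string{"signup", "purchase"},
		"from":  "2026-05-01",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(key)
}
```

Keeping `workspace_id` as its own key segment, ahead of the hash, is what makes per-workspace invalidation possible with a prefix scan.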
---
## Custom SQL Sandbox
`POST /query/sql` allows arbitrary SQL on ClickHouse. Two layers of protection:
**Layer 1 — App-level parse (Go):**
```go
// forbiddenKeyword matches DDL/DML keywords as whole words only, so identifiers
// such as created_at or updated_at are not rejected (requires the regexp package).
// Keywords inside string literals still match; Layer 2 is the real guard.
var forbiddenKeyword = regexp.MustCompile(`\b(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE)\b`)

// Reject anything that is not a SELECT statement
func validateReadOnly(sql string) error {
	normalized := strings.TrimSpace(strings.ToUpper(sql))
	if !strings.HasPrefix(normalized, "SELECT") {
		return apperr.BadRequest("only SELECT statements are allowed", "sql", nil)
	}
	// Reject common DDL/DML keywords
	if kw := forbiddenKeyword.FindString(normalized); kw != "" {
		return apperr.BadRequest("statement contains forbidden keyword: "+kw, "sql", nil)
	}
	return nil
}
```
**Layer 2 — ClickHouse read-only user:**
- Custom SQL queries run as a separate ClickHouse user with `SELECT`-only grants
- DDL/DML rejected at DB level even if app-level check is bypassed
---
## Testing Strategy
### Unit tests — handler + service layer
- Mock interfaces with `testify/mock`
- No real DB, no real Redis, no real ClickHouse
- File: `foo_test.go` alongside the file being tested
### Integration tests — repo layer only
- Use `testcontainers-go` to spin up real PostgreSQL, Redis, ClickHouse
- File: `internal/repo/event_repo_test.go`
- Tag: `//go:build integration`
```bash
make test # unit only (fast, no containers)
make test/integration # repo layer with real DBs (slower, CI)
```
---
## Migration Workflow
```bash
make migrate/new name=add_profile_traits # create up+down files
make migrate/up # apply all pending
make migrate/down # rollback one step
make migrate/status # show current version
```
- Migration files: `infra/migrations/{version}_{name}.up.sql` + `.down.sql`
- **Never** auto-run migrations on server startup
- Every PostgreSQL schema change **must** have a migration file
---
## PostgreSQL Schema (Analytics-owned tables)
```sql
-- Computed trait values per profile
profile_traits (
profile_id UUID,
trait_key TEXT,
trait_value JSONB,
computed_at TIMESTAMPTZ
)
-- Segment membership history (used for delta Reverse ETL)
segment_memberships (
segment_id UUID,
profile_id UUID,
entered_at TIMESTAMPTZ,
exited_at TIMESTAMPTZ -- NULL = currently a member
)
```
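
Since `segment_memberships` exists to support delta Reverse ETL, the core step is a set difference between the previous and the current member list. A minimal sketch, assuming profile IDs are handled as strings and both lists fit in memory; the function name `Delta` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Delta compares two membership snapshots of a segment and returns the
// profiles that entered (in current only) and exited (in previous only).
// Duplicate IDs in either input are counted once; outputs are sorted.
func Delta(previous, current []string) (entered, exited []string) {
	prev := make(map[string]struct{}, len(previous))
	for _, id := range previous {
		prev[id] = struct{}{}
	}
	cur := make(map[string]struct{}, len(current))
	for _, id := range current {
		if _, seen := cur[id]; seen {
			continue
		}
		cur[id] = struct{}{}
		if _, wasMember := prev[id]; !wasMember {
			entered = append(entered, id)
		}
	}
	for id := range prev {
		if _, stillMember := cur[id]; !stillMember {
			exited = append(exited, id)
		}
	}
	sort.Strings(entered)
	sort.Strings(exited)
	return entered, exited
}

func main() {
	entered, exited := Delta(
		[]string{"p1", "p2", "p3"},
		[]string{"p2", "p3", "p4"},
	)
	fmt.Println("entered:", entered)
	fmt.Println("exited:", exited)
}
```

In terms of the table above, each entered ID would become a new row with `entered_at` set, and each exited ID would set `exited_at` on the open row.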
---
## Data Sources (Read-only)
This service **only reads** data written by `cdp-ingestion`. Never write to these tables.
| Source | Data |
|--------|------|
| ClickHouse `events` | Flattened, schema-managed raw events |
| PostgreSQL `profiles` | Identity graph, unified profiles |
| PostgreSQL `sources` / `destinations` | Config metadata |
| PostgreSQL `schemas` | Schema registry from ingestion |
---
## Key Design Decisions
| Problem | Decision |
|---------|---------|
| Job queue | `river` on PostgreSQL — no Temporal, no Celery |
| Computed Traits refresh | Hourly default, configurable per trait |
| Segment re-evaluate | Full re-evaluate — simpler than incremental |
| Query cache | Redis semantic keys, TTL 60s default |
| Custom SQL | App-level SELECT-only check + ClickHouse read-only user |
| Reverse ETL | Delta only (entered/exited) — never push full member list |
| ClickHouse queries | Raw SQL in `.sql` template files — no query builder |
| Scaling | Vertical — increase RAM/CPU, not instances |
| Migration | CLI only — never auto-migrate on startup |
---
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/query/events` | Filter + query raw events |
| `POST` | `/query/sql` | Custom SQL on ClickHouse (SELECT only) |
| `POST` | `/query/funnel` | Funnel analysis |
| `POST` | `/query/retention` | Retention cohort |
| `GET` | `/profiles/:id` | Unified profile lookup |
| `GET` | `/profiles/:id/events` | User event timeline |
| `GET` | `/segments` | List segments |
| `POST` | `/segments` | Create segment |
| `GET` | `/segments/:id/members` | Segment members |
| `GET` | `/traits/definitions` | List computed trait definitions |
| `GET` | `/health` | Health check |
| `GET` | `/ready` | Readiness check |
Every endpoint must have a request struct with `validate` tags. Validation runs before any business logic.
---
## Feature Priorities
| Priority | Features |
|----------|---------|
| **P0** | Event Explorer, Custom SQL, Profile Lookup, Event Timeline, Saved Queries |
| **P1** | Funnel Analysis, Retention Analysis, Session Analysis, Pre-built Dashboards |
| **P2** | Computed Traits, Audience Segments, Background Worker |
| **P3** | Reverse ETL, Webhook Push, Schema Registry, Data Catalog |
Build in priority order. Do not start P1 before P0 is stable.
---
## Logging Policy (zap)
```
Query requests → log workspace_id, query_type, duration_ms, rows_returned, cache_hit
Worker jobs → log job_id, job_kind, args, duration_ms, status (success/error)
Errors → log full error chain with context
```
---
## Coding Rules
- **Do not write code unless asked** — discuss architecture/features first
- **Ask when scope is unclear**, especially when multiple valid approaches exist
- **YAGNI + KISS** — do not build what is not needed yet
- **Correctness before performance** — optimize only when profiling proves it necessary
- **Every PostgreSQL schema change must have a migration file** in `infra/migrations/`
- **Every ClickHouse query must have a `.sql` file** in `infra/clickhouse/`
- **Every API endpoint must have a request struct with `validate` tags**
- **Never write raw events** — this service is read-side only
- Discuss in **Vietnamese**, write code and comments in **English**
---
## Common Pitfalls
- Do not query ClickHouse directly for computed traits at request time — serve from PostgreSQL
- Do not run full segment scans on every API request — that is the worker's job
- Do not cache Custom SQL results — queries are arbitrary, cache would be useless
- Do not inline complex SQL strings in Go — use `.sql` template files
- Do not return raw `pgx` or `clickhouse-go` errors to HTTP handlers — wrap with `AppError`
- Do not run migrations on server startup — use `make migrate/up` explicitly
- Reverse ETL must push delta only (entered/exited), never the full member list per run
- Workers must be idempotent — `river` retries on failure, job may run more than once
- `service` layer must never import `net/http` or `chi`

54
ingestion/.env.example Normal file

@@ -0,0 +1,54 @@
# ---------------------------------------------------------------------------
# Shared infrastructure
# ---------------------------------------------------------------------------
POSTGRES_DSN=postgres://cdp:cdp@localhost:5432/cdp?sslmode=disable
REDIS_ADDR=localhost:6379
KAFKA_BROKERS=localhost:9092
CLICKHOUSE_ADDR=localhost:9000
CLICKHOUSE_DB=cdp
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=
# ---------------------------------------------------------------------------
# Ingest service
# ---------------------------------------------------------------------------
INGEST_HTTP_ADDR=:3049
INGEST_LOG_LEVEL=info
INGEST_PAYLOAD_LIMIT_KB=100
INGEST_BATCH_LIMIT_KB=4000
INGEST_LATE_EVENT_HOURS=24
INGEST_DEDUP_TTL_HOURS=24
INGEST_WRITE_KEY_CACHE_TTL_SECONDS=45
INGEST_LOG_PAYLOAD_ON_SUCCESS=false
INGEST_LOG_PAYLOAD_ON_ERROR=true
INGEST_SHUTDOWN_TIMEOUT_SECONDS=30
# Kafka topics
KAFKA_TOPIC_INGEST=events.ingest
KAFKA_TOPIC_DLQ=events.dlq
KAFKA_TOPIC_RETRY=events.retry
# ---------------------------------------------------------------------------
# Bulker service
# ---------------------------------------------------------------------------
BULKER_HTTP_ADDR=:3042
BULKER_LOG_LEVEL=info
BULKER_KAFKA_GROUP=bulker
BULKER_BATCH_SIZE=1000
BULKER_BATCH_INTERVAL_SECONDS=5
BULKER_SHUTDOWN_TIMEOUT_SECONDS=60
# ---------------------------------------------------------------------------
# Rotor (Node.js)
# ---------------------------------------------------------------------------
ROTOR_PORT=3401
ROTOR_LOG_LEVEL=info
ROTOR_ISOLATE_MEMORY_MB=128
ROTOR_FUNCTION_TIMEOUT_MS=2000
# ---------------------------------------------------------------------------
# Console (Vite)
# ---------------------------------------------------------------------------
VITE_API_BASE_URL=http://localhost:3049
VITE_BULKER_BASE_URL=http://localhost:3042
VITE_ROTOR_BASE_URL=http://localhost:3401

47
ingestion/.gitignore vendored Normal file

@@ -0,0 +1,47 @@
# Binaries
bin/
*.exe
*.test
*.out
# Go workspace
go.work
go.work.sum
# Node
node_modules/
dist/
build/
.next/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
# Env
.env
.env.local
.env.*.local
# IDE
.vscode/
.idea/
*.swp
.DS_Store
# Coverage
coverage/
*.cover
*.coverage
coverage.out
# Docker volumes
infra/docker/data/
# Console build output
console/dist/
console/.vite/
# Rotor
rotor/dist/


@@ -0,0 +1,300 @@
# CLAUDE.md — CDP Ingestion Service
> You are a senior software engineer building the **Data Ingestion Service** for a self-hosted CDP platform,
> inspired by Jitsu. Focus: event streaming, JS functions, identity stitching.
>
> **Scope boundary**: Ingestion only. Analytics & Customer 360 live in a separate service (`cdp-analytics`).
---
## What This Service Does
Collects events from any source → validates, deduplicates, transforms via JS Functions → stores in ClickHouse
and exports to external warehouses. Segment-compatible API for easy migration.
---
## Repository Layout
```
cdp-ingestion/
├── ingest/     # Go — HTTP API, auth, validate, dedup, push to Kafka (port 3049)
├── rotor/      # Node.js — JS functions runner, V8 isolate (port 3401)
├── bulker/     # Go — Kafka consumer, batch write to ClickHouse/warehouses (port 3042)
├── console/    # React + Vite + shadcn/ui + Tailwind — management UI (port 3000)
└── infra/
    ├── docker/
    ├── clickhouse/   # ClickHouse DDL / migrations
    └── migrations/   # PostgreSQL migrations (golang-migrate)
```
---
## Tech Stack
### Go Services (ingest, bulker)
| Concern | Library | Notes |
|---------|---------|-------|
| HTTP router | `chi` | Lightweight, stdlib-compatible middleware |
| Logger | `zap` | Structured logging, fastest |
| PostgreSQL | `pgx/v5` | Native driver, no database/sql wrapper |
| Kafka | `franz-go` | Pure Go, no CGO, best Redpanda support |
| Redis | `rueidis` | Modern client, faster than go-redis |
| Config | `caarlos0/env` | Parse env vars into structs, zero deps |
| Validation | `go-playground/validator/v10` | Struct tags validation |
| Migration | `golang-migrate` + pgx driver | CLI only — never auto-migrate on startup |
| Test assertion | `testify` | assert + require + mock |
| Integration test | `testcontainers-go` | Real PG / Redis / ClickHouse in tests |
### React Console (console/)
| Concern | Library |
|---------|---------|
| Build | Vite |
| UI components | shadcn/ui + Tailwind |
| Routing | React Router v6 |
| Server state | TanStack Query |
| Client state | Zustand |
| Forms | react-hook-form + zod |
| Charts | Recharts |
| Icons | lucide-react |
> **No new technology** without discussion. All additions must justify why existing stack cannot handle it.
---
## Go Project Structure
Every Go service follows this layout:
```
ingest/
├── cmd/
│   └── server/
│       └── main.go     # wire everything, start server
└── internal/
    ├── handler/        # HTTP handlers — parse request, call service, write response
    ├── service/        # business logic — no HTTP, no DB concerns
    ├── repo/           # DB queries — PostgreSQL via pgx, ClickHouse
    ├── kafka/          # producer (ingest) / consumer (bulker)
    ├── middleware/     # auth, rate limit, request ID, logging
    └── config/         # env parsing via caarlos0/env
```
Rules:
- `handler` depends on `service`. `service` depends on `repo`. Never reverse.
- `handler` never touches DB directly.
- `service` never imports `chi` or any HTTP package.
- `repo` returns domain types, never raw `pgx.Rows`.
---
## Error Handling
Use `AppError` for all domain errors. Never return raw `pgx` or stdlib errors to handlers.
```go
// internal/apperr/apperr.go
type AppError struct {
Code int // HTTP status code to return
Message string // user-facing message (safe to expose)
Field string // optional: which field caused the error (schema conflict, validation)
Err error // original error for logging (not exposed to user)
}
func (e *AppError) Error() string { return e.Message }
func (e *AppError) Unwrap() error { return e.Err }
// Constructors
func BadRequest(msg, field string, err error) *AppError
func Conflict(msg string, err error) *AppError
func TooManyRequests(retryAfter int) *AppError
func UnprocessableEntity(msg string) *AppError
func Internal(err error) *AppError
```
Handler pattern — one place to handle all errors:
```go
func writeError(w http.ResponseWriter, err error) {
var appErr *apperr.AppError
if errors.As(err, &appErr) {
// log appErr.Err internally, return appErr.Message to user
render.JSON(w, appErr.Code, ErrorResponse{Error: appErr.Message, Field: appErr.Field})
return
}
// unexpected — log full error, return generic 500
render.JSON(w, 500, ErrorResponse{Error: "internal server error"})
}
```
---
## Testing Strategy
### Unit tests — handler + service layer
- Mock interfaces with `testify/mock`
- No real DB, no real Redis, no real Kafka
- File: `foo_test.go` alongside the file being tested
```go
type EventServiceMock struct { mock.Mock }
func (m *EventServiceMock) Track(ctx context.Context, e *Event) error {
return m.Called(ctx, e).Error(0)
}
```
### Integration tests — repo layer only
- Use `testcontainers-go` to spin up real PostgreSQL, Redis, ClickHouse
- File: `internal/repo/event_repo_test.go`
- Tag: `//go:build integration`
- Run: `make test/integration`
```bash
make test # unit only (fast, no containers)
make test/integration # repo layer with real DBs (slower, CI)
```
---
## Migration Workflow
```bash
# Create new migration
make migrate/new name=add_segment_memberships
# Apply
make migrate/up
# Rollback one step
make migrate/down
# Check status
make migrate/status
```
- Migration files live in `infra/migrations/`
- Format: `{version}_{name}.up.sql` + `{version}_{name}.down.sql`
- **Never** auto-run migrations on server startup
- Every PostgreSQL schema change **must** have a migration file — no exceptions
---
## Ingest Pipeline (Step-by-Step)
```
HTTP Request
1. Auth — Write Key → PostgreSQL lookup, cached in Redis (TTL 30-60s + pub/sub invalidation)
2. Payload validate — size ≤ PAYLOAD_LIMIT_KB (default 100KB), struct + validator tags
3. Rate limit — Redis sliding window per workspace_id; 429 + Retry-After on breach
4. Timestamp — received_at = server time; client time preserved as sent_at
5. Late event check — (received_at - sent_at) > 24h → 422 drop
6. Deduplication — Redis SETNX message_id, TTL 24h
7. JSON flatten — {"a":{"b":1}} → {"a_b":1}
8. Schema validate — type conflict → 400 + field name → push to DLQ
9. Push Kafka — partition key = anonymous_id (ordering for identity stitching)
10. Return 200 OK — fire-and-forget, do not wait for Kafka ack
```
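
Step 7 of the pipeline can be sketched as a small recursive function. This is a minimal illustration under assumptions: only nested objects are flattened, arrays and scalars are kept as they are, and the function names are hypothetical. If two paths produce the same flattened key (for example `"a_b"` next to `{"a":{"b":...}}`), which value wins is undefined in this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Flatten turns nested objects into a single level, joining key paths
// with "_": {"a":{"b":1}} becomes {"a_b":1}.
func Flatten(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	flattenInto(out, "", in)
	return out
}

// flattenInto walks in and writes leaf values to out under prefixed keys.
func flattenInto(out map[string]any, prefix string, in map[string]any) {
	for k, v := range in {
		key := k
		if prefix != "" {
			key = prefix + "_" + k
		}
		if nested, ok := v.(map[string]any); ok {
			flattenInto(out, key, nested)
			continue
		}
		out[key] = v
	}
}

func main() {
	var event map[string]any
	payload := `{"event":"order_completed","properties":{"total":42.5,"cart":{"items":3}}}`
	if err := json.Unmarshal([]byte(payload), &event); err != nil {
		panic(err)
	}
	fmt.Println(Flatten(event))
}
```

Flattening before the schema check means the type of every leaf can be compared against a flat column list, which is what lets step 8 report the exact conflicting field name.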
## Kafka Topics
| Topic | Purpose |
|-------|---------|
| `events.ingest` | Happy path — valid events |
| `events.dlq` | Failed events — schema conflict, validation error, function crash |
| `events.retry` | Replay from DLQ after fix |
---
## Key Design Decisions
| Problem | Decision |
|---------|---------|
| Late events (> 24h) | Drop, return `422 Unprocessable` |
| Schema conflict | Reject `400`, include field name in response, push to DLQ |
| Timestamp authority | Server wins (`received_at`); client time kept as `sent_at` |
| Payload limit | Configurable, default 100KB; batch has separate limit |
| Partition key | `anonymous_id` — guarantees ordering for identity stitching |
| Enrich mode | Async by default — store raw event first, worker enriches after |
| Identity backfill | Async + lock per `anonymous_id` to avoid race condition |
| Write Key cache | Redis TTL 30-60s + pub/sub invalidation on revoke |
| Graceful shutdown | Drain in-flight requests on SIGTERM before exit |
| Migration | CLI only — never auto-migrate on startup |
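
The late-event rule in the table above reduces to one comparison between the two timestamps. A minimal sketch, assuming a missing `sent_at` is treated as "not late" (the source does not say how that case is handled); `IsLate`, `hoursBefore`, and `demoNow` are hypothetical names, and the 24h default mirrors `INGEST_LATE_EVENT_HOURS=24`.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultLateWindow mirrors INGEST_LATE_EVENT_HOURS=24 from .env.example.
const DefaultLateWindow = 24 * time.Hour

// IsLate reports whether received_at - sent_at exceeds the window.
// A zero sentAt (client sent no timestamp) is treated as not late; that
// choice is an assumption of this sketch.
func IsLate(receivedAt, sentAt time.Time, window time.Duration) bool {
	if sentAt.IsZero() {
		return false
	}
	return receivedAt.Sub(sentAt) > window
}

// hoursBefore is a demo helper returning t shifted back by h hours.
func hoursBefore(t time.Time, h int) time.Time {
	return t.Add(-time.Duration(h) * time.Hour)
}

// demoNow is a fixed "server time" so the example is deterministic.
var demoNow = time.Date(2026, time.May, 24, 12, 0, 0, 0, time.UTC)

func main() {
	for _, h := range []int{1, 24, 25} {
		sentAt := hoursBefore(demoNow, h)
		fmt.Printf("sent %dh ago: late=%v\n", h, IsLate(demoNow, sentAt, DefaultLateWindow))
	}
}
```

The comparison is strict, so an event sent exactly 24 hours before it is received is still accepted. A `sent_at` in the future gives a negative difference and is never late under this rule.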
## Rate Limits
| Tier | RPS | Events/day | Burst (5s) |
|------|-----|-----------|-----------|
| Default | 100 | 1M | 500 |
| Pro | 500 | 10M | 2,500 |
| Enterprise | custom | custom | custom |
Rate limit key: `rate:{workspace_id}` — per workspace, not per IP.
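
To illustrate the sliding-window idea behind the per-workspace limit, here is an in-memory sketch. It is a stand-in for the Redis-backed limiter, not a description of it: the type and function names are hypothetical, the window holds a plain slice of timestamps per key, and calls for one key are assumed to arrive with non-decreasing `now` values.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// SlidingWindow allows at most limit hits per key within any window-long span.
type SlidingWindow struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func NewSlidingWindow(limit int, window time.Duration) *SlidingWindow {
	return &SlidingWindow{limit: limit, window: window, hits: make(map[string][]time.Time)}
}

// RateKey builds the per-workspace key described above.
func RateKey(workspaceID string) string { return "rate:" + workspaceID }

// Allow records a hit at now if the key is under its limit. When the hit is
// rejected, the second result is the wait until the oldest hit expires,
// which a handler could send back as Retry-After.
func (s *SlidingWindow) Allow(key string, now time.Time) (bool, time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.limit <= 0 {
		return false, s.window
	}
	// Drop hits that have left the window.
	cutoff := now.Add(-s.window)
	hits := s.hits[key]
	i := 0
	for i < len(hits) && !hits[i].After(cutoff) {
		i++
	}
	hits = hits[i:]
	if len(hits) >= s.limit {
		s.hits[key] = hits
		return false, hits[0].Add(s.window).Sub(now)
	}
	s.hits[key] = append(hits, now)
	return true, 0
}

// Demo helpers: a one-second window and timestamps given in milliseconds.
const demoWindow = time.Second

func atMillis(ms int64) time.Time { return time.UnixMilli(ms) }

func main() {
	lim := NewSlidingWindow(2, demoWindow)
	key := RateKey("ws_123")
	for _, ms := range []int64{0, 100, 200, 1000} {
		ok, retry := lim.Allow(key, atMillis(ms))
		fmt.Printf("t=%dms allowed=%v retry_after=%v\n", ms, ok, retry)
	}
}
```

Passing `now` in as a parameter, instead of calling `time.Now()` inside, keeps the limiter deterministic under test.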
---
## Data Conventions
- **Field names**: `snake_case`; sanitize on ingest (remove spaces, special chars)
- **Timestamps**: `received_at` (server), `sent_at` (client), `timestamp` (event time for analytics)
- **Dedup key**: `message_id`, Redis SETNX, TTL 24h
- **Nested objects**: auto-flatten before schema check
- **Type coercion**: none — type conflict → reject immediately
- **Write Key**: never log raw; always masked in logs
---
## Logging Policy (zap)
```
Happy path → metadata only, no payload LOG_PAYLOAD_ON_SUCCESS=false (default)
Error/reject → full payload logged LOG_PAYLOAD_ON_ERROR=true (default)
Write Key → always masked, never raw
```
Fields logged on every request: `workspace_id`, `source_id`, `message_id`, `event_type`, `duration_ms`, `status_code`.
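
The masking rule for Write Keys could look like the sketch below. The exact format is an assumption: it keeps the last four characters so operators can still tell keys apart, masks everything else, and fully masks keys of four characters or fewer. `MaskWriteKey` is a hypothetical name, and keys are assumed to be ASCII.

```go
package main

import (
	"fmt"
	"strings"
)

// MaskWriteKey replaces all but the last four characters with '*'.
// Keys of four characters or fewer are masked completely.
func MaskWriteKey(key string) string {
	const visible = 4
	if len(key) <= visible {
		return strings.Repeat("*", len(key))
	}
	return strings.Repeat("*", len(key)-visible) + key[len(key)-visible:]
}

func main() {
	fmt.Println(MaskWriteKey("wk_live_abcd1234"))
	fmt.Println(MaskWriteKey("abc"))
}
```

Applying the mask at the point where the logger field is built, not at each call site, makes it harder to log a raw key by accident.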
---
## API Endpoints (Ingest)
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/track` | Single event |
| `POST` | `/batch` | Batch events (Segment-compatible) |
| `POST` | `/identify` | Identify call |
| `POST` | `/page` | Page call |
| `POST` | `/group` | Group call |
| `GET` | `/health` | Health check |
| `GET` | `/ready` | Readiness check |
Every endpoint must have a request struct with `validate` tags. Validation runs before any business logic.
---
## Coding Rules
- **Do not write code unless asked** — discuss architecture/features first
- **Ask when scope is unclear**, especially when multiple valid approaches exist
- **YAGNI + KISS** — do not build what is not needed yet
- **Correctness before performance** — optimize only when profiling proves it necessary
- **Every ClickHouse schema change must have a migration file** in `infra/clickhouse/`
- **Every PostgreSQL schema change must have a migration file** in `infra/migrations/`
- **Every API endpoint must have a request struct with `validate` tags**
- **Never write raw events from analytics layer** — ingestion is the sole writer
- Discuss in **Vietnamese**, write code and comments in **English**
---
## Common Pitfalls
- Do not skip dedup check even for bulk imports — use a different TTL bucket if needed
- Do not change partition key from `anonymous_id` — breaks identity stitching ordering
- Do not cache Write Keys without the pub/sub invalidation path — revoked keys must propagate within TTL
- `rotor` is Node.js, not Go — cross-service calls go over HTTP, never in-process
- DLQ events must be replayable — never mutate DLQ topic; write to `events.retry` for replay
- Do not return raw `pgx` errors to HTTP handlers — always wrap with `AppError`
- Do not run migrations on server startup — use `make migrate/up` explicitly
- `service` layer must never import `net/http` or `chi` — keep HTTP concerns in `handler` only

137
ingestion/Makefile Normal file

@@ -0,0 +1,137 @@
.PHONY: help up down logs migrate/new migrate/up migrate/down migrate/status \
clickhouse/up clickhouse/down \
run/ingest run/bulker run/rotor run/console \
build/ingest build/bulker \
test test/integration \
lint fmt tidy
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
POSTGRES_DSN ?= postgres://cdp:cdp@localhost:5432/cdp?sslmode=disable
CLICKHOUSE_DSN ?= clickhouse://default:@localhost:9000/cdp
MIGRATE_BIN ?= migrate
MIGRATIONS_DIR := infra/migrations
CLICKHOUSE_DIR := infra/clickhouse
# ---------------------------------------------------------------------------
# Help
# ---------------------------------------------------------------------------
help:
@echo "CDP Ingestion - common tasks"
@echo ""
@echo " make up docker-compose up infra (Postgres, Redis, Kafka, ClickHouse)"
@echo " make down docker-compose down"
@echo " make logs tail logs"
@echo ""
@echo " make migrate/new name=X create new PG migration"
@echo " make migrate/up apply PG migrations"
@echo " make migrate/down rollback one"
@echo " make migrate/status migration status"
@echo ""
@echo " make clickhouse/up apply ClickHouse DDL"
@echo " make clickhouse/down drop ClickHouse schema"
@echo ""
@echo " make run/ingest run ingest service (port 3049)"
@echo " make run/bulker run bulker service (port 3042)"
@echo " make run/rotor run rotor service (port 3401)"
@echo " make run/console run console UI (port 3000)"
@echo ""
@echo " make test unit tests"
@echo " make test/integration integration tests (testcontainers)"
# ---------------------------------------------------------------------------
# Docker
# ---------------------------------------------------------------------------
up:
docker compose -f infra/docker/docker-compose.yml up -d
down:
docker compose -f infra/docker/docker-compose.yml down
logs:
docker compose -f infra/docker/docker-compose.yml logs -f --tail=200
# ---------------------------------------------------------------------------
# PostgreSQL migrations
# ---------------------------------------------------------------------------
migrate/new:
@if [ -z "$(name)" ]; then echo "usage: make migrate/new name=add_xxx"; exit 1; fi
$(MIGRATE_BIN) create -ext sql -dir $(MIGRATIONS_DIR) -seq $(name)
migrate/up:
$(MIGRATE_BIN) -path $(MIGRATIONS_DIR) -database "$(POSTGRES_DSN)" up
migrate/down:
$(MIGRATE_BIN) -path $(MIGRATIONS_DIR) -database "$(POSTGRES_DSN)" down 1
migrate/status:
$(MIGRATE_BIN) -path $(MIGRATIONS_DIR) -database "$(POSTGRES_DSN)" version
# ---------------------------------------------------------------------------
# ClickHouse DDL
# ---------------------------------------------------------------------------
clickhouse/up:
@bash infra/scripts/clickhouse_apply.sh up
clickhouse/down:
@bash infra/scripts/clickhouse_apply.sh down
# ---------------------------------------------------------------------------
# Run services
# ---------------------------------------------------------------------------
run/ingest:
cd ingest && go run ./cmd/server
run/bulker:
cd bulker && go run ./cmd/server
run/rotor:
cd rotor && npm run dev
run/console:
cd console && npm run dev
# ---------------------------------------------------------------------------
# Build
# ---------------------------------------------------------------------------
build/ingest:
cd ingest && CGO_ENABLED=0 go build -o ../bin/ingest ./cmd/server
build/bulker:
cd bulker && CGO_ENABLED=0 go build -o ../bin/bulker ./cmd/server
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
test:
cd ingest && go test ./... -count=1
cd bulker && go test ./... -count=1
test/integration:
cd ingest && go test -tags=integration ./... -count=1 -timeout=5m
cd bulker && go test -tags=integration ./... -count=1 -timeout=5m
# ---------------------------------------------------------------------------
# Code quality
# ---------------------------------------------------------------------------
lint:
cd ingest && golangci-lint run ./...
cd bulker && golangci-lint run ./...
fmt:
cd ingest && gofmt -w .
cd bulker && gofmt -w .
tidy:
cd ingest && go mod tidy
cd bulker && go mod tidy

33
ingestion/README.md Normal file

@@ -0,0 +1,33 @@
# CDP Ingestion
Self-hosted CDP ingestion platform inspired by Jitsu. Segment-compatible HTTP API.
## Services
| Service | Lang | Port | Role |
|---------|------|------|------|
| `ingest` | Go | 3049 | HTTP API → auth, validate, dedup, push to Kafka |
| `rotor` | Node.js | 3401 | JS functions runner (V8 isolates) |
| `bulker` | Go | 3042 | Kafka consumer → batch write ClickHouse / warehouses |
| `console` | React + Vite | 3000 | Management UI |
## Quick start
```bash
make up # docker-compose up infra (Postgres, Redis, Kafka, ClickHouse)
make migrate/up # apply PostgreSQL migrations
make clickhouse/up # apply ClickHouse DDL
make run/ingest # start ingest on :3049
make run/bulker # start bulker on :3042
make run/rotor # start rotor on :3401
make run/console # start console on :3000
```
## Testing
```bash
make test # unit tests (no containers)
make test/integration # repo-layer integration tests (testcontainers)
```
See [CLAUDE_ingestion.md](./CLAUDE_ingestion.md) for the full design contract.


@@ -0,0 +1,12 @@
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum* ./
RUN go mod download || true
COPY . .
RUN CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/bulker ./cmd/server
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/bulker /bulker
EXPOSE 3042
USER nonroot:nonroot
ENTRYPOINT ["/bulker"]


@@ -0,0 +1,136 @@
// Command server runs the CDP bulker -- Kafka consumer that batches events
// into ClickHouse (and other warehouses, when configured).
package main
import (
"context"
"encoding/json"
"errors"
"log"
"net/http"
"os"
"os/signal"
"syscall"
"time"
"github.com/go-chi/chi/v5"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/bulker/internal/batcher"
"github.com/dbiz/cdp/ingestion/bulker/internal/config"
"github.com/dbiz/cdp/ingestion/bulker/internal/consumer"
"github.com/dbiz/cdp/ingestion/bulker/internal/model"
"github.com/dbiz/cdp/ingestion/bulker/internal/writer"
)
func main() {
if err := run(); err != nil {
log.Fatal(err)
}
}
func run() error {
cfg, err := config.Load()
if err != nil {
return err
}
logger, err := newLogger(cfg.LogLevel)
if err != nil {
return err
}
defer func() { _ = logger.Sync() }()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// ---- ClickHouse -------------------------------------------------------
ch, err := writer.New(ctx, cfg.ClickHouseAddr, cfg.ClickHouseDB, cfg.ClickHouseUser, cfg.ClickHousePassword)
if err != nil {
return err
}
defer func() { _ = ch.Close() }()
// ---- Batcher ---------------------------------------------------------
b := batcher.New(cfg.BatchSize, cfg.BatchInterval,
func(ctx context.Context, evs []*model.IngestedEvent) error {
_, err := ch.WriteEvents(ctx, evs)
return err
}, logger)
go b.Run(ctx)
// ---- Consumer --------------------------------------------------------
cons, err := consumer.New(consumer.Config{
Brokers: cfg.KafkaBrokers,
Group: cfg.KafkaGroup,
IngestTopic: cfg.KafkaTopicIngest,
DLQTopic: cfg.KafkaTopicDLQ,
}, b, ch, logger)
if err != nil {
return err
}
defer cons.Close()
consumerErr := make(chan error, 1)
go func() { consumerErr <- cons.Run(ctx) }()
// ---- HTTP (health) ---------------------------------------------------
r := chi.NewRouter()
r.Get("/health", func(w http.ResponseWriter, _ *http.Request) {
writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
})
r.Get("/ready", func(w http.ResponseWriter, _ *http.Request) {
writeJSON(w, http.StatusOK, map[string]string{"status": "ready"})
})
srv := &http.Server{
Addr: cfg.HTTPAddr,
Handler: r,
ReadHeaderTimeout: 5 * time.Second,
}
httpErr := make(chan error, 1)
go func() {
logger.Info("bulker http listening", zap.String("addr", cfg.HTTPAddr))
if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
httpErr <- err
}
}()
// ---- Signals ---------------------------------------------------------
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
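// runErr carries an unexpected consumer/HTTP failure back to main so the
// process exits non-zero; a plain signal shutdown returns nil.
var runErr error
select {
case <-sigCh:
logger.Info("shutdown signal received")
case err := <-consumerErr:
logger.Error("consumer stopped unexpectedly", zap.Error(err))
runErr = err
case err := <-httpErr:
logger.Error("http stopped unexpectedly", zap.Error(err))
runErr = err
}
shutCtx, shutCancel := context.WithTimeout(context.Background(), cfg.ShutdownTimeout)
defer shutCancel()
cancel() // stop consumer + batcher
if err := srv.Shutdown(shutCtx); err != nil {
logger.Warn("http shutdown failed", zap.Error(err))
}
// Final flush of anything still buffered; a failure is logged, not fatal.
if err := b.FlushNow(shutCtx); err != nil {
logger.Warn("final flush failed", zap.Error(err))
}
return runErr
}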
func writeJSON(w http.ResponseWriter, status int, body any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(body)
}
func newLogger(level string) (*zap.Logger, error) {
lvl, err := zap.ParseAtomicLevel(level)
if err != nil {
lvl = zap.NewAtomicLevelAt(zap.InfoLevel)
}
cfg := zap.NewProductionConfig()
cfg.Level = lvl
cfg.EncoderConfig.TimeKey = "ts"
cfg.EncoderConfig.MessageKey = "msg"
return cfg.Build()
}

12
ingestion/bulker/go.mod Normal file

@@ -0,0 +1,12 @@
module github.com/dbiz/cdp/ingestion/bulker
go 1.22
require (
github.com/ClickHouse/clickhouse-go/v2 v2.30.0
github.com/caarlos0/env/v11 v11.2.2
github.com/go-chi/chi/v5 v5.1.0
github.com/stretchr/testify v1.9.0
github.com/twmb/franz-go v1.17.1
go.uber.org/zap v1.27.0
)


@@ -0,0 +1,107 @@
// Package batcher accumulates events from the consumer until either the
// size cap or the time cap is hit, then flushes them to the writer.
//
// Flush semantics:
// - on size cap: flush immediately
// - on time cap: flush whatever is buffered (even 0 events: no-op)
// - on shutdown: flush whatever is buffered, then return
package batcher
import (
"context"
"sync"
"time"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/bulker/internal/model"
)
type FlushFunc func(ctx context.Context, events []*model.IngestedEvent) error
type Batcher struct {
size int
interval time.Duration
flush FlushFunc
log *zap.Logger
mu sync.Mutex
buffer []*model.IngestedEvent
}
func New(size int, interval time.Duration, flush FlushFunc, log *zap.Logger) *Batcher {
return &Batcher{
size: size,
interval: interval,
flush: flush,
log: log,
buffer: make([]*model.IngestedEvent, 0, size),
}
}
// Add appends an event. If the size cap is reached we flush synchronously
// before returning so the consumer commit can rely on durability.
func (b *Batcher) Add(ctx context.Context, e *model.IngestedEvent) error {
b.mu.Lock()
b.buffer = append(b.buffer, e)
if len(b.buffer) < b.size {
b.mu.Unlock()
return nil
}
batch := b.swap()
b.mu.Unlock()
return b.doFlush(ctx, batch)
}
// Run blocks until ctx is cancelled, flushing the buffer every interval.
func (b *Batcher) Run(ctx context.Context) {
t := time.NewTicker(b.interval)
defer t.Stop()
for {
select {
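case <-ctx.Done():
// ctx is already cancelled here, so the final flush gets a fresh context.
if err := b.FlushNow(context.Background()); err != nil {
b.log.Warn("final flush failed", zap.Error(err))
}
return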
case <-t.C:
if err := b.FlushNow(ctx); err != nil {
b.log.Warn("batch flush failed", zap.Error(err))
}
}
}
}
// FlushNow swaps the buffer and flushes synchronously.
func (b *Batcher) FlushNow(ctx context.Context) error {
b.mu.Lock()
batch := b.swap()
b.mu.Unlock()
return b.doFlush(ctx, batch)
}
// swap returns the current buffer and replaces it with a fresh slice.
// Caller must hold b.mu.
func (b *Batcher) swap() []*model.IngestedEvent {
if len(b.buffer) == 0 {
return nil
}
out := b.buffer
b.buffer = make([]*model.IngestedEvent, 0, b.size)
return out
}
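// doFlush hands one batch to the flush callback. The caller must NOT hold
// b.mu: on failure the batch is put back under the lock.
func (b *Batcher) doFlush(ctx context.Context, batch []*model.IngestedEvent) error {
if len(batch) == 0 {
return nil
}
start := time.Now()
if err := b.flush(ctx, batch); err != nil {
b.log.Error("flush failed",
zap.Int("count", len(batch)),
zap.Error(err))
// Put the failed batch back at the front of the buffer so the next flush
// retries it; otherwise these events would be dropped while the consumer
// keeps polling and may later commit past them.
b.mu.Lock()
b.buffer = append(batch, b.buffer...)
b.mu.Unlock()
return err
}
b.log.Info("flushed",
zap.Int("count", len(batch)),
zap.Int64("duration_ms", time.Since(start).Milliseconds()))
return nil
}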


@@ -0,0 +1,56 @@
package batcher
import (
"context"
"sync/atomic"
"testing"
"time"
"github.com/stretchr/testify/assert"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/bulker/internal/model"
)
func TestBatcher_FlushesOnSizeCap(t *testing.T) {
var flushed int32
flush := func(_ context.Context, evs []*model.IngestedEvent) error {
atomic.AddInt32(&flushed, int32(len(evs)))
return nil
}
b := New(3, time.Hour, flush, zap.NewNop())
for i := 0; i < 3; i++ {
_ = b.Add(context.Background(), &model.IngestedEvent{MessageID: "x"})
}
assert.Equal(t, int32(3), atomic.LoadInt32(&flushed))
}
func TestBatcher_FlushNow_NoOpOnEmpty(t *testing.T) {
var called int32
flush := func(_ context.Context, _ []*model.IngestedEvent) error {
atomic.AddInt32(&called, 1)
return nil
}
b := New(10, time.Hour, flush, zap.NewNop())
_ = b.FlushNow(context.Background())
assert.Equal(t, int32(0), atomic.LoadInt32(&called))
}
func TestBatcher_FlushesOnTimer(t *testing.T) {
var flushed int32
flush := func(_ context.Context, evs []*model.IngestedEvent) error {
atomic.AddInt32(&flushed, int32(len(evs)))
return nil
}
b := New(1000, 50*time.Millisecond, flush, zap.NewNop())
ctx, cancel := context.WithCancel(context.Background())
go b.Run(ctx)
_ = b.Add(context.Background(), &model.IngestedEvent{MessageID: "a"})
_ = b.Add(context.Background(), &model.IngestedEvent{MessageID: "b"})
time.Sleep(120 * time.Millisecond)
cancel()
time.Sleep(10 * time.Millisecond)
assert.Equal(t, int32(2), atomic.LoadInt32(&flushed))
}


@@ -0,0 +1,35 @@
// Package config loads bulker runtime config from env.
package config
import (
"fmt"
"time"
"github.com/caarlos0/env/v11"
)
type Config struct {
HTTPAddr string `env:"BULKER_HTTP_ADDR" envDefault:":3042"`
LogLevel string `env:"BULKER_LOG_LEVEL" envDefault:"info"`
KafkaGroup string `env:"BULKER_KAFKA_GROUP" envDefault:"bulker"`
BatchSize int `env:"BULKER_BATCH_SIZE" envDefault:"1000"`
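// Despite the _SECONDS suffix, both values are parsed as Go duration strings
// (e.g. "5s", "1m"); a bare number such as "5" fails to parse.
BatchInterval time.Duration `env:"BULKER_BATCH_INTERVAL_SECONDS" envDefault:"5s"`
ShutdownTimeout time.Duration `env:"BULKER_SHUTDOWN_TIMEOUT_SECONDS" envDefault:"60s"`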
KafkaBrokers []string `env:"KAFKA_BROKERS" envSeparator:"," envDefault:"localhost:9092"`
KafkaTopicIngest string `env:"KAFKA_TOPIC_INGEST" envDefault:"events.ingest"`
KafkaTopicDLQ string `env:"KAFKA_TOPIC_DLQ" envDefault:"events.dlq"`
ClickHouseAddr string `env:"CLICKHOUSE_ADDR" envDefault:"localhost:9000"`
ClickHouseDB string `env:"CLICKHOUSE_DB" envDefault:"cdp"`
ClickHouseUser string `env:"CLICKHOUSE_USER" envDefault:"default"`
ClickHousePassword string `env:"CLICKHOUSE_PASSWORD" envDefault:""`
}
func Load() (*Config, error) {
cfg := &Config{}
if err := env.Parse(cfg); err != nil {
return nil, fmt.Errorf("config load: %w", err)
}
return cfg, nil
}


@@ -0,0 +1,130 @@
// Package consumer reads from the ingest Kafka topics and feeds the batcher.
//
// We use franz-go's manual commit mode: commit only after a successful
// batcher flush. Combined with at-least-once semantics from the producer
// and idempotent inserts at the analytics layer this is sufficient.
package consumer
import (
"context"
"encoding/json"
"fmt"
"github.com/twmb/franz-go/pkg/kgo"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/bulker/internal/batcher"
"github.com/dbiz/cdp/ingestion/bulker/internal/model"
"github.com/dbiz/cdp/ingestion/bulker/internal/writer"
)
type Consumer struct {
client *kgo.Client
log *zap.Logger
batcher *batcher.Batcher
writer *writer.ClickHouse
dlqTopic string
ingestTopic string
}
type Config struct {
Brokers []string
Group string
IngestTopic string
DLQTopic string
}
func New(c Config, b *batcher.Batcher, w *writer.ClickHouse, log *zap.Logger) (*Consumer, error) {
cl, err := kgo.NewClient(
kgo.SeedBrokers(c.Brokers...),
kgo.ConsumerGroup(c.Group),
kgo.ConsumeTopics(c.IngestTopic, c.DLQTopic),
kgo.DisableAutoCommit(),
kgo.ClientID("cdp-bulker"),
kgo.SessionTimeout(45_000_000_000), // 45s
kgo.FetchMaxBytes(50 * 1024 * 1024),
)
if err != nil {
return nil, fmt.Errorf("kafka client: %w", err)
}
return &Consumer{
client: cl,
log: log,
batcher: b,
writer: w,
dlqTopic: c.DLQTopic,
ingestTopic: c.IngestTopic,
}, nil
}
func (c *Consumer) Close() {
c.client.Close()
}
// Run polls Kafka until ctx is cancelled. One iteration:
// 1. PollFetches
// 2. For each record, parse JSON and route to the right destination
// 3. Commit offsets only after the batch flush succeeded
func (c *Consumer) Run(ctx context.Context) error {
for {
select {
case <-ctx.Done():
c.log.Info("consumer stopping")
return nil
default:
}
fetches := c.client.PollFetches(ctx)
if errs := fetches.Errors(); len(errs) > 0 {
for _, e := range errs {
c.log.Warn("fetch error",
zap.String("topic", e.Topic),
zap.Int32("partition", e.Partition),
zap.Error(e.Err))
}
}
var dlqBatch []*model.DLQRecord
fetches.EachRecord(func(r *kgo.Record) {
switch r.Topic {
case c.ingestTopic:
var ev model.IngestedEvent
if err := json.Unmarshal(r.Value, &ev); err != nil {
c.log.Warn("ingest decode failed",
zap.String("topic", r.Topic),
zap.Error(err))
return
}
if err := c.batcher.Add(ctx, &ev); err != nil {
c.log.Error("batcher add failed", zap.Error(err))
}
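case c.dlqTopic:
var d model.DLQRecord
if err := json.Unmarshal(r.Value, &d); err != nil {
// Log instead of dropping silently, matching the ingest branch.
c.log.Warn("dlq decode failed",
zap.String("topic", r.Topic),
zap.Error(err))
return
}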
if d.ReceivedAt.IsZero() {
d.ReceivedAt = r.Timestamp
}
dlqBatch = append(dlqBatch, &d)
}
})
if len(dlqBatch) > 0 {
if err := c.writer.WriteDLQ(ctx, dlqBatch); err != nil {
c.log.Error("dlq write failed", zap.Error(err))
}
}
// Force a flush before committing so committed offsets reflect what's
// actually persisted. FlushNow is a no-op when the buffer is empty.
if err := c.batcher.FlushNow(ctx); err != nil {
c.log.Warn("flush before commit failed", zap.Error(err))
continue // skip the commit so these offsets stay uncommitted
}
if err := c.client.CommitUncommittedOffsets(ctx); err != nil {
c.log.Warn("commit failed", zap.Error(err))
}
}
}


@@ -0,0 +1,41 @@
package model
import "time"
// IngestedEvent mirrors the shape ingest publishes onto Kafka.
// Keep these two structs in lock-step (we are intentionally NOT importing
// ingest's package -- bulker compiles standalone).
type IngestedEvent struct {
WorkspaceID string `json:"workspace_id"`
SourceID string `json:"source_id"`
MessageID string `json:"message_id"`
Type string `json:"type"`
AnonymousID string `json:"anonymous_id,omitempty"`
UserID string `json:"user_id,omitempty"`
GroupID string `json:"group_id,omitempty"`
Event string `json:"event,omitempty"`
Name string `json:"name,omitempty"`
Category string `json:"category,omitempty"`
Properties map[string]any `json:"properties,omitempty"`
Traits map[string]any `json:"traits,omitempty"`
Context map[string]any `json:"context,omitempty"`
IP string `json:"ip,omitempty"`
UserAgent string `json:"user_agent,omitempty"`
Timestamp time.Time `json:"timestamp"`
SentAt time.Time `json:"sent_at"`
ReceivedAt time.Time `json:"received_at"`
}
// DLQRecord is the JSON shape the bulker reads from the DLQ topic.
type DLQRecord struct {
WorkspaceID string `json:"workspace_id"`
SourceID string `json:"source_id"`
MessageID string `json:"message_id"`
Reason string `json:"reason"`
Field string `json:"field"`
RawPayload string `json:"raw_payload"`
ReceivedAt time.Time `json:"received_at"`
}


@@ -0,0 +1,250 @@
// Package writer wraps the ClickHouse client for batch inserts.
//
// We use the native clickhouse-go v2 client. One PrepareBatch / Append / Send
// cycle per (table, batch). All maps are stringified before insertion -- the
// ClickHouse schema uses Map(String, String) which keeps the table flat and
// avoids column explosion. Analytics queries cast on read.
package writer
import (
"context"
"encoding/json"
"fmt"
"strconv"
"github.com/ClickHouse/clickhouse-go/v2"
"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
"github.com/dbiz/cdp/ingestion/bulker/internal/model"
)
type ClickHouse struct {
conn driver.Conn
db string
}
func New(ctx context.Context, addr, db, user, password string) (*ClickHouse, error) {
conn, err := clickhouse.Open(&clickhouse.Options{
Addr: []string{addr},
Auth: clickhouse.Auth{
Database: db,
Username: user,
Password: password,
},
Settings: clickhouse.Settings{
"async_insert": 0,
"wait_for_async_insert": 0,
},
})
if err != nil {
return nil, fmt.Errorf("clickhouse open: %w", err)
}
if err := conn.Ping(ctx); err != nil {
return nil, fmt.Errorf("clickhouse ping: %w", err)
}
return &ClickHouse{conn: conn, db: db}, nil
}
func (c *ClickHouse) Close() error { return c.conn.Close() }
// WriteEvents fans out a mixed-type batch into the per-type tables.
// Returns the number of rows successfully inserted across all tables.
func (c *ClickHouse) WriteEvents(ctx context.Context, events []*model.IngestedEvent) (int, error) {
if len(events) == 0 {
return 0, nil
}
// Bucket by event type so each insert hits one table.
buckets := map[string][]*model.IngestedEvent{}
for _, e := range events {
buckets[e.Type] = append(buckets[e.Type], e)
}
total := 0
for t, evs := range buckets {
var err error
switch t {
case "track":
err = c.writeTrack(ctx, evs)
case "identify":
err = c.writeIdentify(ctx, evs)
case "page", "screen":
err = c.writePage(ctx, evs)
case "group":
err = c.writeGroup(ctx, evs)
default:
// alias / unknown types -- write to track for now
err = c.writeTrack(ctx, evs)
}
if err != nil {
return total, fmt.Errorf("write %s: %w", t, err)
}
total += len(evs)
}
return total, nil
}
// ---------------------------------------------------------------------------
// per-table batch inserts
// ---------------------------------------------------------------------------
func (c *ClickHouse) writeTrack(ctx context.Context, evs []*model.IngestedEvent) error {
batch, err := c.conn.PrepareBatch(ctx, "INSERT INTO events_track")
if err != nil {
return err
}
for _, e := range evs {
err := batch.Append(
e.WorkspaceID, e.SourceID, e.MessageID,
e.AnonymousID, e.UserID, e.Event,
e.Timestamp, e.SentAt, e.ReceivedAt,
mapToStr(e.Properties), mapToStr(e.Context),
e.IP, e.UserAgent,
libraryName(e.Context), libraryVersion(e.Context),
)
if err != nil {
return err
}
}
return batch.Send()
}
func (c *ClickHouse) writeIdentify(ctx context.Context, evs []*model.IngestedEvent) error {
batch, err := c.conn.PrepareBatch(ctx, "INSERT INTO events_identify")
if err != nil {
return err
}
for _, e := range evs {
err := batch.Append(
e.WorkspaceID, e.SourceID, e.MessageID,
e.AnonymousID, e.UserID,
e.Timestamp, e.SentAt, e.ReceivedAt,
mapToStr(e.Traits), mapToStr(e.Context),
e.IP, e.UserAgent,
)
if err != nil {
return err
}
}
return batch.Send()
}
func (c *ClickHouse) writePage(ctx context.Context, evs []*model.IngestedEvent) error {
batch, err := c.conn.PrepareBatch(ctx, "INSERT INTO events_page")
if err != nil {
return err
}
for _, e := range evs {
path, _ := e.Properties["path"].(string)
url, _ := e.Properties["url"].(string)
referrer, _ := e.Properties["referrer"].(string)
err := batch.Append(
e.WorkspaceID, e.SourceID, e.MessageID,
e.AnonymousID, e.UserID, e.Name, e.Category,
e.Timestamp, e.SentAt, e.ReceivedAt,
mapToStr(e.Properties), mapToStr(e.Context),
e.IP, e.UserAgent,
referrer, path, url,
)
if err != nil {
return err
}
}
return batch.Send()
}
func (c *ClickHouse) writeGroup(ctx context.Context, evs []*model.IngestedEvent) error {
batch, err := c.conn.PrepareBatch(ctx, "INSERT INTO events_group")
if err != nil {
return err
}
for _, e := range evs {
err := batch.Append(
e.WorkspaceID, e.SourceID, e.MessageID,
e.AnonymousID, e.UserID, e.GroupID,
e.Timestamp, e.SentAt, e.ReceivedAt,
mapToStr(e.Traits), mapToStr(e.Context),
e.IP, e.UserAgent,
)
if err != nil {
return err
}
}
return batch.Send()
}
// WriteDLQ inserts records from the DLQ topic.
func (c *ClickHouse) WriteDLQ(ctx context.Context, recs []*model.DLQRecord) error {
if len(recs) == 0 {
return nil
}
batch, err := c.conn.PrepareBatch(ctx, "INSERT INTO events_dlq")
if err != nil {
return err
}
for _, r := range recs {
if err := batch.Append(
r.WorkspaceID, r.SourceID, r.MessageID, r.ReceivedAt,
r.Reason, r.Field, r.RawPayload,
); err != nil {
return err
}
}
return batch.Send()
}
// ---------------------------------------------------------------------------
// helpers
// ---------------------------------------------------------------------------
// mapToStr converts a map[string]any into the Map(String, String) shape
// ClickHouse expects. Strings pass through unchanged, nil becomes "", numbers
// and bools are formatted with strconv, and anything else is JSON-encoded.
func mapToStr(in map[string]any) map[string]string {
if in == nil {
return map[string]string{}
}
out := make(map[string]string, len(in))
for k, v := range in {
out[k] = anyToStr(v)
}
return out
}
func anyToStr(v any) string {
switch x := v.(type) {
case nil:
return ""
case string:
return x
case float64:
return strconv.FormatFloat(x, 'f', -1, 64)
case int:
return strconv.Itoa(x)
case int64:
return strconv.FormatInt(x, 10)
case bool:
return strconv.FormatBool(x)
default:
b, _ := json.Marshal(v)
return string(b)
}
}
func libraryName(ctx map[string]any) string {
if ctx == nil {
return ""
}
if v, ok := ctx["library_name"].(string); ok {
return v
}
return ""
}
func libraryVersion(ctx map[string]any) string {
if ctx == nil {
return ""
}
if v, ok := ctx["library_version"].(string); ok {
return v
}
return ""
}


@@ -0,0 +1,11 @@
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .
RUN npm run build
FROM nginx:1.27-alpine
COPY --from=build /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 3000


@@ -0,0 +1,13 @@
<!doctype html>
<html lang="en" class="h-full">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>CDP Console</title>
</head>
<body class="h-full bg-background text-foreground antialiased">
<div id="root" class="h-full"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>


@@ -0,0 +1,23 @@
server {
listen 3000;
root /usr/share/nginx/html;
index index.html;
location / {
try_files $uri $uri/ /index.html;
}
# Proxy to backends so the SPA can hit /api/ingest, /api/bulker, /api/rotor.
location /api/ingest/ {
rewrite ^/api/ingest/(.*)$ /$1 break;
proxy_pass http://ingest:3049;
}
location /api/bulker/ {
rewrite ^/api/bulker/(.*)$ /$1 break;
proxy_pass http://bulker:3042;
}
location /api/rotor/ {
rewrite ^/api/rotor/(.*)$ /$1 break;
proxy_pass http://rotor:3401;
}
}


@@ -0,0 +1,44 @@
{
"name": "cdp-console",
"version": "0.1.0",
"private": true,
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"preview": "vite preview --port 3000",
"lint": "eslint ."
},
"dependencies": {
"@radix-ui/react-dialog": "^1.1.2",
"@radix-ui/react-dropdown-menu": "^2.1.2",
"@radix-ui/react-label": "^2.1.0",
"@radix-ui/react-slot": "^1.1.0",
"@radix-ui/react-tabs": "^1.1.1",
"@radix-ui/react-toast": "^1.2.2",
"@tanstack/react-query": "^5.59.16",
"class-variance-authority": "^0.7.0",
"clsx": "^2.1.1",
"lucide-react": "^0.451.0",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"react-hook-form": "^7.53.0",
"react-router-dom": "^6.27.0",
"recharts": "^2.13.0",
"tailwind-merge": "^2.5.4",
"zod": "^3.23.8",
"zustand": "^5.0.0"
},
"devDependencies": {
"@types/node": "^22.7.5",
"@types/react": "^18.3.11",
"@types/react-dom": "^18.3.0",
"@vitejs/plugin-react": "^4.3.2",
"autoprefixer": "^10.4.20",
"eslint": "^9.12.0",
"postcss": "^8.4.47",
"tailwindcss": "^3.4.13",
"typescript": "^5.6.3",
"vite": "^5.4.9"
}
}


@@ -0,0 +1,6 @@
export default {
plugins: {
tailwindcss: {},
autoprefixer: {},
},
};


@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32"><circle cx="16" cy="16" r="14" fill="#0f172a"/><text x="16" y="21" text-anchor="middle" font-family="Inter, sans-serif" font-size="14" font-weight="700" fill="#fff">cdp</text></svg>


@@ -0,0 +1,32 @@
import { BrowserRouter, Route, Routes } from 'react-router-dom';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { AppShell } from '@/components/AppShell';
import { DashboardPage } from '@/pages/Dashboard';
import { SourcesPage } from '@/pages/Sources';
import { DestinationsPage } from '@/pages/Destinations';
import { FunctionsPage } from '@/pages/Functions';
import { LivePage } from '@/pages/Live';
import { SettingsPage } from '@/pages/Settings';
const qc = new QueryClient({
defaultOptions: { queries: { retry: 1, staleTime: 30_000 } },
});
export function App() {
return (
<QueryClientProvider client={qc}>
<BrowserRouter>
<Routes>
<Route element={<AppShell />}>
<Route path="/" element={<DashboardPage />} />
<Route path="/sources" element={<SourcesPage />} />
<Route path="/destinations" element={<DestinationsPage />} />
<Route path="/functions" element={<FunctionsPage />} />
<Route path="/live" element={<LivePage />} />
<Route path="/settings" element={<SettingsPage />} />
</Route>
</Routes>
</BrowserRouter>
</QueryClientProvider>
);
}


@@ -0,0 +1,90 @@
// Thin fetch wrapper. Throws on non-2xx with a structured ApiError.
export class ApiError extends Error {
status: number;
field?: string;
constructor(status: number, message: string, field?: string) {
super(message);
this.name = 'ApiError'; // so logs and error.name checks identify the type
this.status = status;
this.field = field;
}
}
const INGEST_BASE = import.meta.env.VITE_API_BASE_URL ?? '/api/ingest';
const ROTOR_BASE = import.meta.env.VITE_ROTOR_BASE_URL ?? '/api/rotor';
const BULKER_BASE = import.meta.env.VITE_BULKER_BASE_URL ?? '/api/bulker';
async function request<T>(base: string, path: string, init?: RequestInit): Promise<T> {
const res = await fetch(`${base}${path}`, {
...init,
headers: {
'content-type': 'application/json',
...(init?.headers ?? {}),
},
});
const text = await res.text();
const data = text ? safeJSON(text) : undefined;
if (!res.ok) {
const msg = (data as { error?: string })?.error ?? res.statusText;
const field = (data as { field?: string })?.field;
throw new ApiError(res.status, msg, field);
}
return data as T;
}
function safeJSON(text: string): unknown {
try {
return JSON.parse(text);
} catch {
return text;
}
}
// ---------------------------------------------------------------------------
// Ingest API
// ---------------------------------------------------------------------------
export const ingest = {
health: () => request<{ status: string }>(INGEST_BASE, '/health'),
ready: () => request<{ status: string }>(INGEST_BASE, '/ready'),
track: (writeKey: string, body: Record<string, unknown>) =>
request<{ ok: boolean }>(INGEST_BASE, '/track', {
method: 'POST',
headers: { Authorization: `Bearer ${writeKey}` },
body: JSON.stringify(body),
}),
};
// ---------------------------------------------------------------------------
// Rotor API
// ---------------------------------------------------------------------------
export interface RunRequest {
code: string;
event: Record<string, unknown>;
}
export interface RunResponse {
result: unknown;
}
export const rotor = {
run: (body: RunRequest) =>
request<RunResponse>(ROTOR_BASE, '/v1/run', {
method: 'POST',
body: JSON.stringify(body),
}),
upsert: (body: { workspace_id: string; slug: string; code: string }) =>
request<{ ok: boolean }>(ROTOR_BASE, '/v1/functions', {
method: 'POST',
body: JSON.stringify(body),
}),
};
// ---------------------------------------------------------------------------
// Bulker API
// ---------------------------------------------------------------------------
export const bulker = {
health: () => request<{ status: string }>(BULKER_BASE, '/health'),
};


@@ -0,0 +1,50 @@
import { NavLink, Outlet } from 'react-router-dom';
import {
Activity, BarChart3, Code2, Database, Settings, Workflow,
} from 'lucide-react';
import { cn } from '@/lib/utils';
const nav = [
{ to: '/', label: 'Dashboard', icon: BarChart3 },
{ to: '/sources', label: 'Sources', icon: Workflow },
{ to: '/destinations', label: 'Destinations', icon: Database },
{ to: '/functions', label: 'Functions', icon: Code2 },
{ to: '/live', label: 'Live events', icon: Activity },
{ to: '/settings', label: 'Settings', icon: Settings },
];
export function AppShell() {
return (
<div className="flex h-full">
<aside className="w-60 shrink-0 border-r bg-muted/30 p-4">
<div className="mb-6 px-2">
<div className="text-lg font-semibold">CDP Console</div>
<div className="text-xs text-muted-foreground">Ingestion</div>
</div>
<nav className="space-y-1">
{nav.map(({ to, label, icon: Icon }) => (
<NavLink
key={to}
to={to}
end={to === '/'}
className={({ isActive }) =>
cn(
'flex items-center gap-2 rounded-md px-3 py-2 text-sm font-medium',
isActive
? 'bg-primary text-primary-foreground'
: 'text-foreground hover:bg-accent hover:text-accent-foreground',
)
}
>
<Icon className="h-4 w-4" />
{label}
</NavLink>
))}
</nav>
</aside>
<main className="flex-1 overflow-auto p-8">
<Outlet />
</main>
</div>
);
}


@@ -0,0 +1,25 @@
import * as React from 'react';
import { cva, type VariantProps } from 'class-variance-authority';
import { cn } from '@/lib/utils';
const badgeVariants = cva(
'inline-flex items-center rounded-full border px-2.5 py-0.5 text-xs font-semibold transition-colors',
{
variants: {
variant: {
default: 'border-transparent bg-primary text-primary-foreground',
secondary: 'border-transparent bg-muted text-foreground',
destructive: 'border-transparent bg-destructive text-destructive-foreground',
outline: 'text-foreground',
success: 'border-transparent bg-emerald-500 text-white',
},
},
defaultVariants: { variant: 'default' },
},
);
export interface BadgeProps extends React.HTMLAttributes<HTMLDivElement>, VariantProps<typeof badgeVariants> {}
export function Badge({ className, variant, ...props }: BadgeProps) {
return <div className={cn(badgeVariants({ variant }), className)} {...props} />;
}


@@ -0,0 +1,48 @@
import * as React from 'react';
import { Slot } from '@radix-ui/react-slot';
import { cva, type VariantProps } from 'class-variance-authority';
import { cn } from '@/lib/utils';
const buttonVariants = cva(
'inline-flex items-center justify-center whitespace-nowrap rounded-md text-sm font-medium ring-offset-background transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:pointer-events-none disabled:opacity-50',
{
variants: {
variant: {
default: 'bg-primary text-primary-foreground hover:bg-primary/90',
destructive: 'bg-destructive text-destructive-foreground hover:bg-destructive/90',
outline: 'border border-input bg-background hover:bg-accent hover:text-accent-foreground',
ghost: 'hover:bg-accent hover:text-accent-foreground',
link: 'text-primary underline-offset-4 hover:underline',
},
size: {
default: 'h-10 px-4 py-2',
sm: 'h-9 rounded-md px-3',
lg: 'h-11 rounded-md px-8',
icon: 'h-10 w-10',
},
},
defaultVariants: { variant: 'default', size: 'default' },
},
);
export interface ButtonProps
extends React.ButtonHTMLAttributes<HTMLButtonElement>,
VariantProps<typeof buttonVariants> {
asChild?: boolean;
}
export const Button = React.forwardRef<HTMLButtonElement, ButtonProps>(
({ className, variant, size, asChild = false, ...props }, ref) => {
const Comp = asChild ? Slot : 'button';
return (
<Comp
className={cn(buttonVariants({ variant, size, className }))}
ref={ref}
{...props}
/>
);
},
);
Button.displayName = 'Button';
export { buttonVariants };


@@ -0,0 +1,44 @@
import * as React from 'react';
import { cn } from '@/lib/utils';
export const Card = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('rounded-lg border bg-card text-card-foreground shadow-sm', className)} {...props} />
),
);
Card.displayName = 'Card';
export const CardHeader = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('flex flex-col space-y-1.5 p-6', className)} {...props} />
),
);
CardHeader.displayName = 'CardHeader';
export const CardTitle = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('text-lg font-semibold leading-none tracking-tight', className)} {...props} />
),
);
CardTitle.displayName = 'CardTitle';
export const CardDescription = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('text-sm text-muted-foreground', className)} {...props} />
),
);
CardDescription.displayName = 'CardDescription';
export const CardContent = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('p-6 pt-0', className)} {...props} />
),
);
CardContent.displayName = 'CardContent';
export const CardFooter = React.forwardRef<HTMLDivElement, React.HTMLAttributes<HTMLDivElement>>(
({ className, ...props }, ref) => (
<div ref={ref} className={cn('flex items-center p-6 pt-0', className)} {...props} />
),
);
CardFooter.displayName = 'CardFooter';


@@ -0,0 +1,19 @@
import * as React from 'react';
import { cn } from '@/lib/utils';
export type InputProps = React.InputHTMLAttributes<HTMLInputElement>;
export const Input = React.forwardRef<HTMLInputElement, InputProps>(
({ className, type, ...props }, ref) => (
<input
ref={ref}
type={type}
className={cn(
'flex h-10 w-full rounded-md border border-input bg-background px-3 py-2 text-sm ring-offset-background placeholder:text-muted-foreground focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50',
className,
)}
{...props}
/>
),
);
Input.displayName = 'Input';

@@ -0,0 +1,46 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
@layer base {
:root {
--background: 0 0% 100%;
--foreground: 222.2 84% 4.9%;
--card: 0 0% 100%;
--card-foreground: 222.2 84% 4.9%;
--primary: 222.2 47.4% 11.2%;
--primary-foreground: 210 40% 98%;
--muted: 210 40% 96.1%;
--muted-foreground: 215.4 16.3% 46.9%;
--accent: 210 40% 96.1%;
--accent-foreground: 222.2 47.4% 11.2%;
--destructive: 0 84.2% 60.2%;
--destructive-foreground: 210 40% 98%;
--border: 214.3 31.8% 91.4%;
--input: 214.3 31.8% 91.4%;
--ring: 222.2 84% 4.9%;
--radius: 0.5rem;
}
.dark {
--background: 222.2 84% 4.9%;
--foreground: 210 40% 98%;
--card: 222.2 84% 4.9%;
--card-foreground: 210 40% 98%;
--primary: 210 40% 98%;
--primary-foreground: 222.2 47.4% 11.2%;
--muted: 217.2 32.6% 17.5%;
--muted-foreground: 215 20.2% 65.1%;
--accent: 217.2 32.6% 17.5%;
--accent-foreground: 210 40% 98%;
--destructive: 0 62.8% 30.6%;
--destructive-foreground: 210 40% 98%;
--border: 217.2 32.6% 17.5%;
--input: 217.2 32.6% 17.5%;
--ring: 212.7 26.8% 83.9%;
}
}
@layer base {
* { @apply border-border; }
body { @apply bg-background text-foreground; }
}

@@ -0,0 +1,6 @@
import { type ClassValue, clsx } from 'clsx';
import { twMerge } from 'tailwind-merge';
export function cn(...inputs: ClassValue[]) {
return twMerge(clsx(inputs));
}

@@ -0,0 +1,10 @@
import React from 'react';
import ReactDOM from 'react-dom/client';
import { App } from './App';
import './index.css';
ReactDOM.createRoot(document.getElementById('root')!).render(
<React.StrictMode>
<App />
</React.StrictMode>,
);

@@ -0,0 +1,81 @@
import { useQuery } from '@tanstack/react-query';
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { Badge } from '@/components/ui/badge';
import { bulker, ingest } from '@/api/client';
export function DashboardPage() {
const ingestHealth = useQuery({
queryKey: ['health', 'ingest'],
queryFn: ingest.health,
refetchInterval: 5_000,
});
const bulkerHealth = useQuery({
queryKey: ['health', 'bulker'],
queryFn: bulker.health,
refetchInterval: 5_000,
});
return (
<div className="space-y-6">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Dashboard</h1>
<p className="text-sm text-muted-foreground">Operational status of the ingestion stack.</p>
</div>
<div className="grid gap-4 md:grid-cols-3">
<ServiceCard name="Ingest" status={statusFromQuery(ingestHealth)} port={3049} />
<ServiceCard name="Bulker" status={statusFromQuery(bulkerHealth)} port={3042} />
<ServiceCard name="Rotor" status="unknown" port={3401} />
</div>
<Card>
<CardHeader>
<CardTitle>Getting started</CardTitle>
<CardDescription>Send a test event with the dev write key.</CardDescription>
</CardHeader>
<CardContent>
<pre className="overflow-x-auto rounded-md bg-muted p-4 text-xs">
{`curl -X POST http://localhost:3049/track \\
-H 'Authorization: Bearer cdp_dev_writekey_1234567890' \\
-H 'Content-Type: application/json' \\
-d '{
"type": "track",
"messageId": "m_${Date.now()}",
"anonymousId": "anon_1",
"event": "Signed Up",
"properties": { "plan": "pro" }
}'`}
</pre>
</CardContent>
</Card>
</div>
);
}
function ServiceCard({ name, status, port }: { name: string; status: ServiceStatus; port: number }) {
return (
<Card>
<CardHeader>
<div className="flex items-center justify-between">
<CardTitle>{name}</CardTitle>
<StatusBadge status={status} />
</div>
<CardDescription>localhost:{port}</CardDescription>
</CardHeader>
</Card>
);
}
type ServiceStatus = 'ok' | 'down' | 'unknown';
function statusFromQuery(q: { isLoading: boolean; isError: boolean; data?: { status: string } }): ServiceStatus {
if (q.isLoading) return 'unknown';
if (q.isError) return 'down';
return q.data?.status === 'ok' ? 'ok' : 'down';
}
function StatusBadge({ status }: { status: ServiceStatus }) {
if (status === 'ok') return <Badge variant="success">healthy</Badge>;
if (status === 'down') return <Badge variant="destructive">down</Badge>;
return <Badge variant="secondary">unknown</Badge>;
}

@@ -0,0 +1,43 @@
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { Badge } from '@/components/ui/badge';
import { Button } from '@/components/ui/button';
import { Database, Plus } from 'lucide-react';
const destinations = [
{ id: '1', name: 'ClickHouse (warehouse)', kind: 'clickhouse', enabled: true },
{ id: '2', name: 'BigQuery (BI)', kind: 'bigquery', enabled: false },
];
export function DestinationsPage() {
return (
<div className="space-y-6">
<div className="flex items-center justify-between">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Destinations</h1>
<p className="text-sm text-muted-foreground">Where events end up.</p>
</div>
<Button>
<Plus className="mr-2 h-4 w-4" /> New destination
</Button>
</div>
<div className="grid gap-4 md:grid-cols-2">
{destinations.map((d) => (
<Card key={d.id}>
<CardHeader className="flex flex-row items-start justify-between space-y-0">
<div className="flex items-center gap-3">
<div className="rounded-md bg-muted p-2"><Database className="h-5 w-5" /></div>
<div>
<CardTitle>{d.name}</CardTitle>
<CardDescription>{d.kind}</CardDescription>
</div>
</div>
{d.enabled ? <Badge variant="success">on</Badge> : <Badge variant="secondary">off</Badge>}
</CardHeader>
<CardContent />
</Card>
))}
</div>
</div>
);
}

@@ -0,0 +1,101 @@
import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { Button } from '@/components/ui/button';
import { ApiError, rotor } from '@/api/client';
const DEFAULT_CODE = `// Transform an event before it's stored.
// Return the (possibly modified) event, null to drop, or an array to fan out.
function transform(event) {
event.properties = event.properties || {};
event.properties.tagged_at = new Date().toISOString();
return event;
}`;
const DEFAULT_EVENT = JSON.stringify({
workspace_id: 'ws-1',
source_id: 'src-1',
message_id: 'm-1',
type: 'track',
event: 'Signed Up',
properties: { plan: 'pro' },
}, null, 2);
export function FunctionsPage() {
const [code, setCode] = useState(DEFAULT_CODE);
const [eventText, setEventText] = useState(DEFAULT_EVENT);
const [output, setOutput] = useState<string>('');
const run = useMutation({
mutationFn: async () => {
let event: Record<string, unknown>;
try {
event = JSON.parse(eventText);
} catch (err) {
throw new ApiError(400, `event is not valid JSON: ${(err as Error).message}`);
}
return rotor.run({ code, event });
},
onSuccess: (data) => setOutput(JSON.stringify(data.result, null, 2)),
onError: (err: ApiError) => setOutput(`ERROR (${err.status}): ${err.message}`),
});
return (
<div className="space-y-6">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Functions</h1>
<p className="text-sm text-muted-foreground">Author and test transformation functions.</p>
</div>
<div className="grid gap-4 lg:grid-cols-2">
<Card>
<CardHeader>
<CardTitle>Code</CardTitle>
<CardDescription>Define <code>transform(event)</code>.</CardDescription>
</CardHeader>
<CardContent>
<textarea
value={code}
onChange={(e) => setCode(e.target.value)}
spellCheck={false}
className="h-72 w-full rounded-md border bg-muted p-3 font-mono text-xs"
/>
</CardContent>
</Card>
<Card>
<CardHeader>
<CardTitle>Sample event</CardTitle>
<CardDescription>The input passed to <code>transform()</code>.</CardDescription>
</CardHeader>
<CardContent>
<textarea
value={eventText}
onChange={(e) => setEventText(e.target.value)}
spellCheck={false}
className="h-72 w-full rounded-md border bg-muted p-3 font-mono text-xs"
/>
</CardContent>
</Card>
</div>
<div className="flex items-center gap-3">
<Button onClick={() => run.mutate()} disabled={run.isPending}>
{run.isPending ? 'Running…' : 'Run'}
</Button>
<span className="text-sm text-muted-foreground">
rotor will execute in a V8 isolate with a 2s timeout
</span>
</div>
<Card>
<CardHeader>
<CardTitle>Output</CardTitle>
</CardHeader>
<CardContent>
<pre className="min-h-32 overflow-auto rounded-md bg-muted p-3 text-xs">{output || '— run to see output —'}</pre>
</CardContent>
</Card>
</div>
);
}
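
The default snippet in the editor says a transform may return the event, `null` to drop it, or an array to fan out. How rotor interprets the return value is not shown in this commit, so the following is only a sketch of one plausible way to normalize the three cases into a list of events. `CdpEvent`, `normalizeTransformResult`, and `applyTransform` are hypothetical names, and treating `undefined` like `null` is an assumption.

```typescript
// Hypothetical, simplified event shape; the real schema belongs to the ingestion service.
interface CdpEvent {
  type: string;
  event?: string;
  properties?: Record<string, unknown>;
}

// The return shapes the Functions page documents for transform().
type TransformResult = CdpEvent | (CdpEvent | null)[] | null | undefined;

// Normalize a transform() return value into zero or more events.
// Assumption: undefined is treated like null (drop), and null entries inside
// a fan-out array are skipped.
function normalizeTransformResult(result: TransformResult): CdpEvent[] {
  if (result === null || result === undefined) return [];
  if (Array.isArray(result)) {
    return result.filter((e): e is CdpEvent => e !== null && e !== undefined);
  }
  return [result];
}

// Run a user-supplied transform on a shallow copy of the event, so top-level
// assignments in user code do not touch the caller's object.
function applyTransform(
  transform: (event: CdpEvent) => TransformResult,
  event: CdpEvent,
): CdpEvent[] {
  return normalizeTransformResult(transform({ ...event }));
}
```

Returning an array in every case lets the caller use one loop for "drop" (empty list), "pass through" (one element), and "fan out" (several elements).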

@@ -0,0 +1,74 @@
import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';
import { Button } from '@/components/ui/button';
import { Input } from '@/components/ui/input';
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { ApiError, ingest } from '@/api/client';
interface LogEntry {
ts: string;
ok: boolean;
message: string;
}
export function LivePage() {
const [writeKey, setWriteKey] = useState('cdp_dev_writekey_1234567890');
const [logs, setLogs] = useState<LogEntry[]>([]);
const send = useMutation({
mutationFn: async () =>
ingest.track(writeKey, {
type: 'track',
messageId: 'm_' + Date.now() + '_' + Math.random().toString(36).slice(2, 7),
anonymousId: 'anon_console',
event: 'Console Test',
properties: { source: 'console', at: new Date().toISOString() },
}),
onSuccess: () =>
setLogs((prev) => [{ ts: new Date().toLocaleTimeString(), ok: true, message: 'event accepted' }, ...prev].slice(0, 50)),
onError: (err: ApiError) =>
setLogs((prev) => [{ ts: new Date().toLocaleTimeString(), ok: false, message: `${err.status} ${err.message}` }, ...prev].slice(0, 50)),
});
return (
<div className="space-y-6">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Live events</h1>
<p className="text-sm text-muted-foreground">Send a synthetic event and watch the response.</p>
</div>
<Card>
<CardHeader>
<CardTitle>Send test event</CardTitle>
<CardDescription>Uses the dev write key by default.</CardDescription>
</CardHeader>
<CardContent className="space-y-3">
<div className="flex gap-2">
<Input value={writeKey} onChange={(e) => setWriteKey(e.target.value)} placeholder="write key" />
<Button onClick={() => send.mutate()} disabled={send.isPending}>
{send.isPending ? 'Sending…' : 'Send'}
</Button>
</div>
</CardContent>
</Card>
<Card>
<CardHeader>
<CardTitle>Log</CardTitle>
</CardHeader>
<CardContent>
<div className="space-y-1 font-mono text-xs">
{logs.length === 0 && <div className="text-muted-foreground"> no events yet </div>}
{logs.map((l, i) => (
<div key={i} className="flex gap-3">
<span className="text-muted-foreground">{l.ts}</span>
<span className={l.ok ? 'text-emerald-600' : 'text-destructive'}>{l.ok ? 'OK' : 'ERR'}</span>
<span>{l.message}</span>
</div>
))}
</div>
</CardContent>
</Card>
</div>
);
}

@@ -0,0 +1,30 @@
import { useWorkspace } from '@/stores/workspace';
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { Input } from '@/components/ui/input';
export function SettingsPage() {
const { currentWorkspace, setCurrentWorkspace } = useWorkspace();
return (
<div className="space-y-6">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Settings</h1>
<p className="text-sm text-muted-foreground">Workspace configuration.</p>
</div>
<Card>
<CardHeader>
<CardTitle>Workspace</CardTitle>
<CardDescription>Identifier used by the local console state.</CardDescription>
</CardHeader>
<CardContent>
<Input
value={currentWorkspace}
onChange={(e) => setCurrentWorkspace(e.target.value)}
placeholder="workspace slug"
/>
</CardContent>
</Card>
</div>
);
}

@@ -0,0 +1,46 @@
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card';
import { Badge } from '@/components/ui/badge';
import { Button } from '@/components/ui/button';
import { Plus } from 'lucide-react';
// Placeholder data -- swap to a TanStack Query call against the control-plane
// API when /sources endpoints land.
const sources = [
{ id: '1', name: 'Web tracker', kind: 'web', enabled: true, events_24h: 12_482 },
{ id: '2', name: 'iOS app', kind: 'mobile', enabled: true, events_24h: 4_201 },
{ id: '3', name: 'Server', kind: 'server', enabled: false, events_24h: 0 },
];
export function SourcesPage() {
return (
<div className="space-y-6">
<div className="flex items-center justify-between">
<div>
<h1 className="text-2xl font-semibold tracking-tight">Sources</h1>
<p className="text-sm text-muted-foreground">Where events come from.</p>
</div>
<Button>
<Plus className="mr-2 h-4 w-4" /> New source
</Button>
</div>
<div className="grid gap-4 md:grid-cols-2 lg:grid-cols-3">
{sources.map((s) => (
<Card key={s.id}>
<CardHeader>
<div className="flex items-center justify-between">
<CardTitle>{s.name}</CardTitle>
{s.enabled ? <Badge variant="success">on</Badge> : <Badge variant="secondary">off</Badge>}
</div>
<CardDescription>{s.kind}</CardDescription>
</CardHeader>
<CardContent>
<div className="text-sm text-muted-foreground">last 24h</div>
<div className="text-2xl font-semibold">{s.events_24h.toLocaleString()}</div>
</CardContent>
</Card>
))}
</div>
</div>
);
}

@@ -0,0 +1,13 @@
import { create } from 'zustand';
interface WorkspaceState {
// For the scaffold we keep this purely local. Replace with API-loaded list
// when the control-plane endpoints are wired up.
currentWorkspace: string;
setCurrentWorkspace: (id: string) => void;
}
export const useWorkspace = create<WorkspaceState>((set) => ({
currentWorkspace: 'dev',
setCurrentWorkspace: (id) => set({ currentWorkspace: id }),
}));

@@ -0,0 +1,50 @@
import type { Config } from 'tailwindcss';
const config: Config = {
darkMode: ['class'],
content: ['./index.html', './src/**/*.{ts,tsx}'],
theme: {
container: {
center: true,
padding: '1rem',
screens: { '2xl': '1400px' },
},
extend: {
colors: {
border: 'hsl(var(--border))',
input: 'hsl(var(--input))',
ring: 'hsl(var(--ring))',
background: 'hsl(var(--background))',
foreground: 'hsl(var(--foreground))',
primary: {
DEFAULT: 'hsl(var(--primary))',
foreground: 'hsl(var(--primary-foreground))',
},
muted: {
DEFAULT: 'hsl(var(--muted))',
foreground: 'hsl(var(--muted-foreground))',
},
accent: {
DEFAULT: 'hsl(var(--accent))',
foreground: 'hsl(var(--accent-foreground))',
},
destructive: {
DEFAULT: 'hsl(var(--destructive))',
foreground: 'hsl(var(--destructive-foreground))',
},
card: {
DEFAULT: 'hsl(var(--card))',
foreground: 'hsl(var(--card-foreground))',
},
},
borderRadius: {
lg: 'var(--radius)',
md: 'calc(var(--radius) - 2px)',
sm: 'calc(var(--radius) - 4px)',
},
},
},
plugins: [],
};
export default config;

@@ -0,0 +1,22 @@
{
"compilerOptions": {
"target": "ES2022",
"useDefineForClassFields": true,
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"module": "ESNext",
"skipLibCheck": true,
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": true,
"jsx": "react-jsx",
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true,
"baseUrl": ".",
"paths": { "@/*": ["./src/*"] }
},
"include": ["src", "vite.config.ts"]
}

@@ -0,0 +1,20 @@
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import path from 'node:path';
export default defineConfig({
plugins: [react()],
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
port: 3000,
proxy: {
'/api/ingest': { target: 'http://localhost:3049', changeOrigin: true, rewrite: (p) => p.replace(/^\/api\/ingest/, '') },
'/api/bulker': { target: 'http://localhost:3042', changeOrigin: true, rewrite: (p) => p.replace(/^\/api\/bulker/, '') },
'/api/rotor': { target: 'http://localhost:3401', changeOrigin: true, rewrite: (p) => p.replace(/^\/api\/rotor/, '') },
},
},
});
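
The three proxy entries share one pattern: strip the `/api/<service>` prefix and forward the remainder to that service's port. The sketch below restates that mapping as a plain function so it can be checked in isolation. `resolveUpstream` is a hypothetical helper, not part of Vite's API, and it matches prefixes slightly more strictly than Vite's plain prefix matching (it requires a `/` or end of path after the prefix).

```typescript
// Route table mirroring the dev-server proxy config: path prefix -> upstream origin.
const routes: Record<string, string> = {
  '/api/ingest': 'http://localhost:3049',
  '/api/bulker': 'http://localhost:3042',
  '/api/rotor': 'http://localhost:3401',
};

// Hypothetical helper (not a Vite API): returns the upstream URL a console
// request path would be forwarded to, or null when no proxy rule applies.
function resolveUpstream(path: string): string | null {
  for (const [prefix, target] of Object.entries(routes)) {
    // Stricter than a bare startsWith: '/api/ingestion' must not match '/api/ingest'.
    if (path === prefix || path.startsWith(prefix + '/')) {
      // Dropping the prefix has the same effect as the rewrite() callbacks above.
      return target + path.slice(prefix.length);
    }
  }
  return null;
}
```

Under these assumptions, a console request to `/api/ingest/track` reaches the ingest service as `/track`, which is the same endpoint the curl example on the dashboard calls directly.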

@@ -0,0 +1,5 @@
DROP TABLE IF EXISTS events_dlq;
DROP TABLE IF EXISTS events_group;
DROP TABLE IF EXISTS events_page;
DROP TABLE IF EXISTS events_identify;
DROP TABLE IF EXISTS events_track;

@@ -0,0 +1,117 @@
-- ---------------------------------------------------------------------------
-- ClickHouse event store.
--
-- One wide table per event family (track / identify / page / group).
-- All share the same key columns. Custom fields are flattened into the
-- properties / traits Map.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS events_track
(
workspace_id String,
source_id String,
message_id String,
anonymous_id String,
user_id String,
event String,
timestamp DateTime64(3, 'UTC'),
sent_at DateTime64(3, 'UTC'),
received_at DateTime64(3, 'UTC'),
properties Map(String, String),
context Map(String, String),
ip String,
user_agent String,
library_name String,
library_version String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(received_at)
ORDER BY (workspace_id, source_id, received_at, message_id)
TTL toDateTime(received_at) + INTERVAL 18 MONTH
SETTINGS index_granularity = 8192;
CREATE TABLE IF NOT EXISTS events_identify
(
workspace_id String,
source_id String,
message_id String,
anonymous_id String,
user_id String,
timestamp DateTime64(3, 'UTC'),
sent_at DateTime64(3, 'UTC'),
received_at DateTime64(3, 'UTC'),
traits Map(String, String),
context Map(String, String),
ip String,
user_agent String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(received_at)
ORDER BY (workspace_id, source_id, received_at, message_id)
TTL toDateTime(received_at) + INTERVAL 18 MONTH
SETTINGS index_granularity = 8192;
CREATE TABLE IF NOT EXISTS events_page
(
workspace_id String,
source_id String,
message_id String,
anonymous_id String,
user_id String,
name String,
category String,
timestamp DateTime64(3, 'UTC'),
sent_at DateTime64(3, 'UTC'),
received_at DateTime64(3, 'UTC'),
properties Map(String, String),
context Map(String, String),
ip String,
user_agent String,
referrer String,
path String,
url String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(received_at)
ORDER BY (workspace_id, source_id, received_at, message_id)
TTL toDateTime(received_at) + INTERVAL 18 MONTH
SETTINGS index_granularity = 8192;
CREATE TABLE IF NOT EXISTS events_group
(
workspace_id String,
source_id String,
message_id String,
anonymous_id String,
user_id String,
group_id String,
timestamp DateTime64(3, 'UTC'),
sent_at DateTime64(3, 'UTC'),
received_at DateTime64(3, 'UTC'),
traits Map(String, String),
context Map(String, String),
ip String,
user_agent String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(received_at)
ORDER BY (workspace_id, source_id, received_at, message_id)
TTL toDateTime(received_at) + INTERVAL 18 MONTH
SETTINGS index_granularity = 8192;
-- Dead-letter queue for events that failed validation / transformation.
CREATE TABLE IF NOT EXISTS events_dlq
(
workspace_id String,
source_id String,
message_id String,
received_at DateTime64(3, 'UTC'),
reason String,
field String,
raw_payload String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(received_at)
ORDER BY (workspace_id, received_at)
TTL toDateTime(received_at) + INTERVAL 30 DAY
SETTINGS index_granularity = 8192;
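
The tables store `properties`, `traits`, and `context` as `Map(String, String)`, so nested JSON has to be flattened to string keys and string values before insert. The exact scheme the pipeline uses is not shown in this commit. The sketch below assumes dotted key paths and JSON-encoded non-string leaves; `flattenToStringMap` is a hypothetical name.

```typescript
// Flatten nested JSON into a flat string -> string record suitable for a
// ClickHouse Map(String, String) column.
// Assumptions (not confirmed by the schema): keys are joined with '.',
// arrays and non-string scalars are JSON-encoded, null/undefined are skipped.
function flattenToStringMap(
  input: Record<string, unknown>,
  prefix = '',
  out: Record<string, string> = {},
): Record<string, string> {
  for (const [key, value] of Object.entries(input)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value === null || value === undefined) continue;
    if (typeof value === 'string') {
      // Strings are stored as-is, without JSON quoting.
      out[path] = value;
    } else if (typeof value === 'object' && !Array.isArray(value)) {
      // Recurse into nested objects, extending the dotted path.
      flattenToStringMap(value as Record<string, unknown>, path, out);
    } else {
      // Numbers, booleans, and arrays become their JSON text.
      out[path] = JSON.stringify(value);
    }
  }
  return out;
}
```

With this scheme a query would read a nested field as `properties['billing.currency']`, and numeric values would need a cast on the read side because every map value is a string.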

@@ -0,0 +1,19 @@
<?xml version="1.0"?>
<clickhouse>
<!-- Reasonable defaults for local development -->
<logger>
<level>information</level>
<console>1</console>
</logger>
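<!-- Listen on all interfaces so the host and other containers can connect (dev only) -->
<listen_host>0.0.0.0</listen_host>
<!-- Default settings profile for dev. ClickHouse normally reads profiles from users.xml / users.d/, so this block may need to live there to take effect. -->
<profiles>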
<default>
<max_memory_usage>4000000000</max_memory_usage>
<use_uncompressed_cache>0</use_uncompressed_cache>
<load_balancing>random</load_balancing>
</default>
</profiles>
</clickhouse>

@@ -0,0 +1,113 @@
version: "3.9"

# ---------------------------------------------------------------------------
# CDP Ingestion - local development infrastructure
#
# Brings up: PostgreSQL, Redis, Redpanda (Kafka), ClickHouse, Redpanda Console
# ---------------------------------------------------------------------------

services:
  postgres:
    image: postgres:16-alpine
    container_name: cdp-postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: cdp
      POSTGRES_PASSWORD: cdp
      POSTGRES_DB: cdp
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U cdp -d cdp"]
      interval: 5s
      timeout: 3s
      retries: 10

  redis:
    image: redis:7-alpine
    container_name: cdp-redis
    restart: unless-stopped
    command: ["redis-server", "--appendonly", "yes", "--save", "60", "1"]
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 10

  redpanda:
    image: redpandadata/redpanda:v24.2.7
    container_name: cdp-redpanda
    restart: unless-stopped
    command:
      - redpanda
      - start
      - --kafka-addr=internal://0.0.0.0:9092,external://0.0.0.0:19092
      - --advertise-kafka-addr=internal://redpanda:9092,external://localhost:19092
      - --pandaproxy-addr=internal://0.0.0.0:8082,external://0.0.0.0:18082
      - --advertise-pandaproxy-addr=internal://redpanda:8082,external://localhost:18082
      - --schema-registry-addr=internal://0.0.0.0:8081,external://0.0.0.0:18081
      - --rpc-addr=0.0.0.0:33145
      - --advertise-rpc-addr=redpanda:33145
      - --smp=1
      - --memory=1G
      - --overprovisioned
      - --node-id=0
      - --check=false
    ports:
      - "9092:9092"
      - "19092:19092"
      - "9644:9644"
    volumes:
      - redpanda_data:/var/lib/redpanda/data
    healthcheck:
      test: ["CMD-SHELL", "rpk cluster health | grep -E 'Healthy:.+true' || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 10

  redpanda-console:
    image: redpandadata/console:v2.7.2
    container_name: cdp-redpanda-console
    restart: unless-stopped
    depends_on:
      - redpanda
    environment:
      KAFKA_BROKERS: redpanda:9092
    ports:
      - "8080:8080"

  clickhouse:
    image: clickhouse/clickhouse-server:24.8
    container_name: cdp-clickhouse
    restart: unless-stopped
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    environment:
      CLICKHOUSE_DB: cdp
      CLICKHOUSE_USER: default
      CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: "1"
    ports:
      - "8123:8123" # HTTP
      - "9000:9000" # Native
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/local.xml:ro
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- http://localhost:8123/ping | grep -q Ok"]
      interval: 5s
      timeout: 3s
      retries: 10

volumes:
  postgres_data:
  redis_data:
  redpanda_data:
  clickhouse_data:

@@ -0,0 +1,12 @@
DROP TABLE IF EXISTS audit_log;
DROP TABLE IF EXISTS schema_fields;
DROP TABLE IF EXISTS function_attachments;
DROP TABLE IF EXISTS functions;
DROP TABLE IF EXISTS source_destination_links;
DROP TABLE IF EXISTS destinations;
DROP TABLE IF EXISTS write_keys;
DROP TABLE IF EXISTS sources;
DROP TABLE IF EXISTS workspace_members;
DROP TABLE IF EXISTS users;
DROP TABLE IF EXISTS workspaces;
DROP EXTENSION IF EXISTS "pgcrypto";

@@ -0,0 +1,178 @@
-- ---------------------------------------------------------------------------
-- Initial schema for CDP Ingestion control plane.
--
-- This database stores configuration, not events. Events live in ClickHouse.
-- ---------------------------------------------------------------------------
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
-- ---------------------------------------------------------------------------
-- workspaces
-- ---------------------------------------------------------------------------
CREATE TABLE workspaces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
slug TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'default'
CHECK (tier IN ('default', 'pro', 'enterprise')),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ
);
CREATE INDEX idx_workspaces_slug ON workspaces (slug) WHERE deleted_at IS NULL;
-- ---------------------------------------------------------------------------
-- users (console operators)
-- ---------------------------------------------------------------------------
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT NOT NULL UNIQUE,
password_hash TEXT NOT NULL,
name TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE workspace_members (
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
user_id UUID NOT NULL REFERENCES users (id) ON DELETE CASCADE,
role TEXT NOT NULL DEFAULT 'member'
CHECK (role IN ('owner', 'admin', 'member', 'viewer')),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (workspace_id, user_id)
);
-- ---------------------------------------------------------------------------
-- sources -- each source is something that pushes events (web, mobile, server)
-- ---------------------------------------------------------------------------
CREATE TABLE sources (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
slug TEXT NOT NULL,
name TEXT NOT NULL,
kind TEXT NOT NULL
CHECK (kind IN ('web', 'mobile', 'server', 'segment', 'webhook')),
enabled BOOLEAN NOT NULL DEFAULT TRUE,
settings JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ,
UNIQUE (workspace_id, slug)
);
CREATE INDEX idx_sources_workspace ON sources (workspace_id) WHERE deleted_at IS NULL;
-- ---------------------------------------------------------------------------
-- write_keys -- API auth tokens, scoped to a source
-- ---------------------------------------------------------------------------
CREATE TABLE write_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
source_id UUID NOT NULL REFERENCES sources (id) ON DELETE CASCADE,
key_hash TEXT NOT NULL UNIQUE, -- store hash, never raw
key_prefix TEXT NOT NULL, -- first ~8 chars for display
label TEXT,
revoked_at TIMESTAMPTZ,
last_used_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_write_keys_workspace ON write_keys (workspace_id) WHERE revoked_at IS NULL;
CREATE INDEX idx_write_keys_source ON write_keys (source_id) WHERE revoked_at IS NULL;
-- ---------------------------------------------------------------------------
-- destinations -- where events are forwarded (clickhouse, snowflake, bq, s3...)
-- ---------------------------------------------------------------------------
CREATE TABLE destinations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
slug TEXT NOT NULL,
name TEXT NOT NULL,
kind TEXT NOT NULL
CHECK (kind IN ('clickhouse', 'postgres', 'snowflake', 'bigquery',
'redshift', 's3', 'webhook')),
enabled BOOLEAN NOT NULL DEFAULT TRUE,
config JSONB NOT NULL DEFAULT '{}', -- credentials encrypted at rest
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ,
UNIQUE (workspace_id, slug)
);
CREATE INDEX idx_destinations_workspace ON destinations (workspace_id) WHERE deleted_at IS NULL;
-- source -> destination wiring
CREATE TABLE source_destination_links (
source_id UUID NOT NULL REFERENCES sources (id) ON DELETE CASCADE,
destination_id UUID NOT NULL REFERENCES destinations (id) ON DELETE CASCADE,
enabled BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (source_id, destination_id)
);
-- ---------------------------------------------------------------------------
-- functions -- JS transformation code run by rotor
-- ---------------------------------------------------------------------------
CREATE TABLE functions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
slug TEXT NOT NULL,
name TEXT NOT NULL,
code TEXT NOT NULL,
enabled BOOLEAN NOT NULL DEFAULT TRUE,
version INTEGER NOT NULL DEFAULT 1,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ,
UNIQUE (workspace_id, slug)
);
CREATE TABLE function_attachments (
source_id UUID REFERENCES sources (id) ON DELETE CASCADE,
destination_id UUID REFERENCES destinations (id) ON DELETE CASCADE,
function_id UUID NOT NULL REFERENCES functions (id) ON DELETE CASCADE,
position INTEGER NOT NULL DEFAULT 0,
enabled BOOLEAN NOT NULL DEFAULT TRUE,
CHECK (
(source_id IS NOT NULL AND destination_id IS NULL) OR
(source_id IS NULL AND destination_id IS NOT NULL)
)
);
CREATE INDEX idx_function_attachments_source ON function_attachments (source_id);
CREATE INDEX idx_function_attachments_destination ON function_attachments (destination_id);
-- ---------------------------------------------------------------------------
-- schema_fields -- discovered field types per (workspace, event_type, field)
-- ---------------------------------------------------------------------------
CREATE TABLE schema_fields (
workspace_id UUID NOT NULL REFERENCES workspaces (id) ON DELETE CASCADE,
event_type TEXT NOT NULL,
field TEXT NOT NULL,
data_type TEXT NOT NULL
CHECK (data_type IN ('string', 'number', 'boolean',
'object', 'array', 'timestamp', 'null')),
first_seen_at TIMESTAMPTZ NOT NULL DEFAULT now(),
last_seen_at TIMESTAMPTZ NOT NULL DEFAULT now(),
sample_count BIGINT NOT NULL DEFAULT 1,
PRIMARY KEY (workspace_id, event_type, field)
);
CREATE INDEX idx_schema_fields_event ON schema_fields (workspace_id, event_type);
-- ---------------------------------------------------------------------------
-- audit_log -- security-relevant operations
-- ---------------------------------------------------------------------------
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
workspace_id UUID REFERENCES workspaces (id) ON DELETE SET NULL,
actor_id UUID REFERENCES users (id) ON DELETE SET NULL,
action TEXT NOT NULL,
target_type TEXT,
target_id TEXT,
metadata JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_audit_log_workspace ON audit_log (workspace_id, created_at DESC);

@@ -0,0 +1,3 @@
DELETE FROM write_keys WHERE id = '00000000-0000-0000-0000-000000000100';
DELETE FROM sources WHERE id = '00000000-0000-0000-0000-000000000010';
DELETE FROM workspaces WHERE id = '00000000-0000-0000-0000-000000000001';

@@ -0,0 +1,24 @@
-- ---------------------------------------------------------------------------
-- Local dev seed -- a default workspace + source + write key.
-- The plaintext write key for development is: cdp_dev_writekey_1234567890
-- key_hash below is the hex-encoded SHA-256 of that string, computed with pgcrypto's digest().
-- ---------------------------------------------------------------------------
INSERT INTO workspaces (id, slug, name, tier)
VALUES ('00000000-0000-0000-0000-000000000001', 'dev', 'Dev Workspace', 'default')
ON CONFLICT (slug) DO NOTHING;
INSERT INTO sources (id, workspace_id, slug, name, kind)
VALUES ('00000000-0000-0000-0000-000000000010',
'00000000-0000-0000-0000-000000000001',
'web', 'Dev Web Source', 'web')
ON CONFLICT (workspace_id, slug) DO NOTHING;
INSERT INTO write_keys (id, workspace_id, source_id, key_hash, key_prefix, label)
VALUES ('00000000-0000-0000-0000-000000000100',
'00000000-0000-0000-0000-000000000001',
'00000000-0000-0000-0000-000000000010',
encode(digest('cdp_dev_writekey_1234567890', 'sha256'), 'hex'),
'cdp_dev_',
'dev key')
ON CONFLICT (key_hash) DO NOTHING;
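
The seed stores only the hex-encoded SHA-256 of the write key plus an 8-character display prefix. A service checking an incoming key would presumably hash it the same way before looking it up in `write_keys.key_hash`. The sketch below shows that derivation with Node's built-in `crypto` module. `deriveWriteKeyFields` is a hypothetical helper, and since the ingest service itself is written in Go, this only illustrates the scheme rather than the actual lookup code.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: derive the two stored columns from a raw write key.
// Mirrors encode(digest(key, 'sha256'), 'hex') and the 8-character prefix
// used in the seed migration.
function deriveWriteKeyFields(rawKey: string): { keyHash: string; keyPrefix: string } {
  return {
    // Lowercase hex SHA-256 of the UTF-8 bytes of the key.
    keyHash: createHash('sha256').update(rawKey, 'utf8').digest('hex'),
    // First 8 characters, safe to show in a UI.
    keyPrefix: rawKey.slice(0, 8),
  };
}

const fields = deriveWriteKeyFields('cdp_dev_writekey_1234567890');
console.log(fields.keyPrefix); // prints "cdp_dev_"
```

Because only the hash is stored, a leaked database dump does not reveal usable keys, and the prefix gives operators enough to tell keys apart in the console.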

@@ -0,0 +1,67 @@
#!/usr/bin/env bash
# Apply / drop ClickHouse DDL files in alphabetical order.
#
# Usage:
#   clickhouse_apply.sh up      apply *.up.sql in infra/clickhouse/
#   clickhouse_apply.sh down    apply *.down.sql in REVERSE order
#
# Env:
#   CLICKHOUSE_ADDR      (default localhost:9000)
#   CLICKHOUSE_DB        (default cdp)
#   CLICKHOUSE_USER      (default default)
#   CLICKHOUSE_PASSWORD  (default empty)
set -euo pipefail
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)/clickhouse"
ADDR="${CLICKHOUSE_ADDR:-localhost:9000}"
DB="${CLICKHOUSE_DB:-cdp}"
USER="${CLICKHOUSE_USER:-default}"
PASS="${CLICKHOUSE_PASSWORD:-}"
MODE="${1:-up}"
host="${ADDR%%:*}"
port="${ADDR##*:}"
run_sql() {
local file="$1"
echo ">>> applying $(basename "$file")"
if [[ -n "$PASS" ]]; then
clickhouse-client --host "$host" --port "$port" --user "$USER" --password "$PASS" \
--database "$DB" --multiquery --queries-file "$file"
else
clickhouse-client --host "$host" --port "$port" --user "$USER" \
--database "$DB" --multiquery --queries-file "$file"
fi
}
ensure_db() {
if [[ -n "$PASS" ]]; then
clickhouse-client --host "$host" --port "$port" --user "$USER" --password "$PASS" \
--query "CREATE DATABASE IF NOT EXISTS $DB"
else
clickhouse-client --host "$host" --port "$port" --user "$USER" \
--query "CREATE DATABASE IF NOT EXISTS $DB"
fi
}
case "$MODE" in
up)
ensure_db
for f in $(ls "$DIR"/*.up.sql 2>/dev/null | sort); do
run_sql "$f"
done
;;
down)
for f in $(ls "$DIR"/*.down.sql 2>/dev/null | sort -r); do
run_sql "$f"
done
;;
*)
echo "usage: $0 {up|down}" >&2
exit 1
;;
esac
echo "done."

@@ -0,0 +1,12 @@
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum* ./
RUN go mod download || true
COPY . .
RUN CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/ingest ./cmd/server
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/ingest /ingest
EXPOSE 3049
USER nonroot:nonroot
ENTRYPOINT ["/ingest"]

@@ -0,0 +1,157 @@
// Command server runs the CDP ingest HTTP API.
package main
import (
"context"
"errors"
"log"
"net/http"
"os"
"os/signal"
"syscall"
"time"
"github.com/go-chi/chi/v5"
"github.com/redis/rueidis"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/config"
"github.com/dbiz/cdp/ingestion/ingest/internal/dedup"
"github.com/dbiz/cdp/ingestion/ingest/internal/handler"
"github.com/dbiz/cdp/ingestion/ingest/internal/kafka"
mw "github.com/dbiz/cdp/ingestion/ingest/internal/middleware"
"github.com/dbiz/cdp/ingestion/ingest/internal/ratelimit"
"github.com/dbiz/cdp/ingestion/ingest/internal/repo"
"github.com/dbiz/cdp/ingestion/ingest/internal/service"
)
func main() {
if err := run(); err != nil {
log.Fatal(err)
}
}
func run() error {
cfg, err := config.Load()
if err != nil {
return err
}
logger, err := newLogger(cfg.LogLevel)
if err != nil {
return err
}
defer func() { _ = logger.Sync() }()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// ---- infra clients ----------------------------------------------------
pg, err := repo.NewPool(ctx, cfg.PostgresDSN)
if err != nil {
return err
}
defer pg.Close()
redisClient, err := rueidis.NewClient(rueidis.ClientOption{
InitAddress: []string{cfg.RedisAddr},
})
if err != nil {
return err
}
defer redisClient.Close()
producer, err := kafka.NewProducer(cfg.KafkaBrokers,
cfg.KafkaTopicIngest, cfg.KafkaTopicDLQ, cfg.KafkaTopicRetry, logger)
if err != nil {
return err
}
defer producer.Close()
// ---- repos / services -------------------------------------------------
writeKeyRepo := repo.NewWriteKeyRepo(pg)
schemaRepo := repo.NewSchemaRepo(pg)
authSvc := service.NewAuthService(writeKeyRepo, redisClient, cfg.WriteKeyCacheTTL, logger)
ingestSvc := service.NewIngestService(service.IngestDeps{
Producer: producer,
Limiter: ratelimit.New(redisClient),
Dedup: dedup.New(redisClient, time.Duration(cfg.DedupTTLHours)*time.Hour),
Schema: schemaRepo,
Log: logger,
LateAfter: time.Duration(cfg.LateEventHours) * time.Hour,
})
evHandler := handler.NewEventHandler(ingestSvc, logger)
// ---- HTTP router ------------------------------------------------------
r := chi.NewRouter()
r.Use(mw.RequestID)
r.Use(mw.Recover(logger))
r.Use(mw.Logger(logger))
r.Use(mw.CORS)
r.Use(mw.PayloadLimit(cfg.PayloadLimitKB))
// public health endpoints
r.Get("/health", evHandler.Health)
r.Get("/ready", evHandler.Ready)
// authenticated routes
r.Group(func(rr chi.Router) {
rr.Use(mw.Auth(authSvc))
rr.Post("/track", evHandler.Track)
rr.Post("/identify", evHandler.Identify)
rr.Post("/page", evHandler.Page)
rr.Post("/group", evHandler.Group)
rr.Post("/alias", evHandler.Alias)
rr.Post("/screen", evHandler.Screen)
// batch has its own (larger) payload limit
rr.With(mw.PayloadLimit(cfg.BatchLimitKB)).Post("/batch", evHandler.Batch)
// Segment compatibility paths
rr.With(mw.PayloadLimit(cfg.BatchLimitKB)).Post("/v1/batch", evHandler.Batch)
rr.Post("/v1/track", evHandler.Track)
rr.Post("/v1/identify", evHandler.Identify)
rr.Post("/v1/page", evHandler.Page)
rr.Post("/v1/group", evHandler.Group)
rr.Post("/v1/alias", evHandler.Alias)
rr.Post("/v1/screen", evHandler.Screen)
})
srv := &http.Server{
Addr: cfg.HTTPAddr,
Handler: r,
ReadHeaderTimeout: 5 * time.Second,
ReadTimeout: 30 * time.Second,
WriteTimeout: 30 * time.Second,
IdleTimeout: 120 * time.Second,
}
// ---- graceful shutdown ------------------------------------------------
shutdownErr := make(chan error, 1)
go func() {
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
<-sigCh
logger.Info("shutdown signal received; draining...")
shutCtx, cancel := context.WithTimeout(context.Background(), cfg.ShutdownTimeout)
defer cancel()
shutdownErr <- srv.Shutdown(shutCtx)
}()
logger.Info("ingest listening", zap.String("addr", cfg.HTTPAddr))
if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
return err
}
return <-shutdownErr
}
func newLogger(level string) (*zap.Logger, error) {
lvl, err := zap.ParseAtomicLevel(level)
if err != nil {
lvl = zap.NewAtomicLevelAt(zap.InfoLevel)
}
cfg := zap.NewProductionConfig()
cfg.Level = lvl
cfg.EncoderConfig.TimeKey = "ts"
cfg.EncoderConfig.MessageKey = "msg"
return cfg.Build()
}

15
ingestion/ingest/go.mod Normal file

@@ -0,0 +1,15 @@
module github.com/dbiz/cdp/ingestion/ingest
go 1.22
require (
github.com/caarlos0/env/v11 v11.2.2
github.com/go-chi/chi/v5 v5.1.0
github.com/go-playground/validator/v10 v10.22.1
github.com/google/uuid v1.6.0
github.com/jackc/pgx/v5 v5.6.0
github.com/redis/rueidis v1.0.45
github.com/stretchr/testify v1.9.0
github.com/twmb/franz-go v1.17.1
go.uber.org/zap v1.27.0
)

@@ -0,0 +1,79 @@
// Package apperr defines AppError, the single error type returned by every
// service/repo function. Handlers translate AppError into HTTP responses.
package apperr
import (
"errors"
"fmt"
"net/http"
)
type AppError struct {
Code int // HTTP status to return
Message string // user-facing message (safe to expose)
Field string // optional: which field caused the error
RetryAfter int // seconds, for 429
Err error // original error for logging (never exposed)
}
func (e *AppError) Error() string {
if e.Err != nil {
return fmt.Sprintf("%s: %v", e.Message, e.Err)
}
return e.Message
}
func (e *AppError) Unwrap() error { return e.Err }
// As reports whether err is or wraps an *AppError.
func As(err error) (*AppError, bool) {
var ae *AppError
if errors.As(err, &ae) {
return ae, true
}
return nil, false
}
// ---------------------------------------------------------------------------
// Constructors
// ---------------------------------------------------------------------------
func BadRequest(msg, field string, err error) *AppError {
return &AppError{Code: http.StatusBadRequest, Message: msg, Field: field, Err: err}
}
func Unauthorized(msg string) *AppError {
return &AppError{Code: http.StatusUnauthorized, Message: msg}
}
func Forbidden(msg string) *AppError {
return &AppError{Code: http.StatusForbidden, Message: msg}
}
func NotFound(msg string) *AppError {
return &AppError{Code: http.StatusNotFound, Message: msg}
}
func Conflict(msg string, err error) *AppError {
return &AppError{Code: http.StatusConflict, Message: msg, Err: err}
}
func PayloadTooLarge(msg string) *AppError {
return &AppError{Code: http.StatusRequestEntityTooLarge, Message: msg}
}
func UnprocessableEntity(msg string) *AppError {
return &AppError{Code: http.StatusUnprocessableEntity, Message: msg}
}
func TooManyRequests(retryAfterSeconds int) *AppError {
return &AppError{
Code: http.StatusTooManyRequests,
Message: "rate limit exceeded",
RetryAfter: retryAfterSeconds,
}
}
func Internal(err error) *AppError {
return &AppError{Code: http.StatusInternalServerError, Message: "internal server error", Err: err}
}
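
Because `AppError` implements `Unwrap`, a handler can still recover it after an intermediate layer has wrapped it with `%w`. The sketch below illustrates that lookup with a trimmed copy of the type. `statusFor` and `wrap` are hypothetical helpers introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// AppError is a trimmed copy of apperr.AppError, for illustration only.
type AppError struct {
	Code    int
	Message string
	Err     error
}

func (e *AppError) Error() string { return e.Message }
func (e *AppError) Unwrap() error { return e.Err }

// statusFor returns the AppError code found anywhere in the error chain,
// or 500 when the chain contains no AppError. Hypothetical helper.
func statusFor(err error) int {
	var ae *AppError
	if errors.As(err, &ae) {
		return ae.Code
	}
	return http.StatusInternalServerError
}

// wrap adds context the way a service layer might, keeping the chain intact.
func wrap(err error) error { return fmt.Errorf("resolve write key: %w", err) }

func main() {
	notFound := &AppError{Code: http.StatusNotFound, Message: "write key not found"}
	fmt.Println(statusFor(wrap(notFound)))      // 404
	fmt.Println(statusFor(errors.New("boom"))) // 500
}
```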

@@ -0,0 +1,41 @@
// Package config loads runtime configuration from environment variables.
//
// All env vars are prefixed with INGEST_ except shared infra ones
// (POSTGRES_DSN, REDIS_ADDR, KAFKA_BROKERS, KAFKA_TOPIC_*).
package config
import (
"fmt"
"time"
"github.com/caarlos0/env/v11"
)
type Config struct {
HTTPAddr string `env:"INGEST_HTTP_ADDR" envDefault:":3049"`
LogLevel string `env:"INGEST_LOG_LEVEL" envDefault:"info"`
PayloadLimitKB int `env:"INGEST_PAYLOAD_LIMIT_KB" envDefault:"100"`
BatchLimitKB int `env:"INGEST_BATCH_LIMIT_KB" envDefault:"4000"`
LateEventHours int `env:"INGEST_LATE_EVENT_HOURS" envDefault:"24"`
DedupTTLHours int `env:"INGEST_DEDUP_TTL_HOURS" envDefault:"24"`
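// Despite the _SECONDS suffix, these two are parsed as Go durations ("45s", "1m"), not bare integers.
WriteKeyCacheTTL time.Duration `env:"INGEST_WRITE_KEY_CACHE_TTL_SECONDS" envDefault:"45s"`
ShutdownTimeout time.Duration `env:"INGEST_SHUTDOWN_TIMEOUT_SECONDS" envDefault:"30s"`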
LogPayloadOnSuccess bool `env:"INGEST_LOG_PAYLOAD_ON_SUCCESS" envDefault:"false"`
LogPayloadOnError bool `env:"INGEST_LOG_PAYLOAD_ON_ERROR" envDefault:"true"`
PostgresDSN string `env:"POSTGRES_DSN,required"`
RedisAddr string `env:"REDIS_ADDR" envDefault:"localhost:6379"`
KafkaBrokers []string `env:"KAFKA_BROKERS" envSeparator:"," envDefault:"localhost:9092"`
KafkaTopicIngest string `env:"KAFKA_TOPIC_INGEST" envDefault:"events.ingest"`
KafkaTopicDLQ string `env:"KAFKA_TOPIC_DLQ" envDefault:"events.dlq"`
KafkaTopicRetry string `env:"KAFKA_TOPIC_RETRY" envDefault:"events.retry"`
}
func Load() (*Config, error) {
cfg := &Config{}
if err := env.Parse(cfg); err != nil {
return nil, fmt.Errorf("config load: %w", err)
}
return cfg, nil
}

@@ -0,0 +1,50 @@
// Package dedup provides idempotent event acceptance via Redis SETNX.
//
// Key shape: dedup:{workspace_id}:{message_id}
// TTL: 24h by default (configurable)
//
// CheckAndSet returns true when the message_id is new (first time seen).
// If it returns false the caller MUST drop the event silently and return 200.
package dedup
import (
"context"
"fmt"
"time"
"github.com/redis/rueidis"
)
type Dedup interface {
CheckAndSet(ctx context.Context, workspaceID, messageID string) (bool, error)
}
type redisDedup struct {
client rueidis.Client
ttl time.Duration
}
func New(client rueidis.Client, ttl time.Duration) Dedup {
return &redisDedup{client: client, ttl: ttl}
}
func key(workspaceID, messageID string) string {
return fmt.Sprintf("dedup:%s:%s", workspaceID, messageID)
}
func (d *redisDedup) CheckAndSet(ctx context.Context, workspaceID, messageID string) (bool, error) {
k := key(workspaceID, messageID)
cmd := d.client.B().Set().Key(k).Value("1").
Nx().
Ex(d.ttl).
Build()
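// SET with NX replies "OK" when the key was set and a nil reply when it
// already exists. rueidis surfaces a nil reply as an error, so test for it
// explicitly before treating the error as a failure.
if err := d.client.Do(ctx, cmd).Error(); err != nil {
if rueidis.IsRedisNil(err) {
return false, nil
}
return false, fmt.Errorf("dedup setnx: %w", err)
}
return true, nil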
}
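
The contract described in the package comment (true on first sight, false while the key is still alive) can be exercised without Redis. The following is a minimal in-memory sketch under that assumption. `memDedup`, `newMemDedup`, and `isNew` are hypothetical names, and the type is meant for unit tests, not as a replacement for the Redis-backed implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// memDedup is a hypothetical in-memory stand-in for the Redis-backed Dedup.
type memDedup struct {
	mu      sync.Mutex
	ttl     time.Duration
	expires map[string]time.Time // key -> expiry instant
}

func newMemDedup(ttl time.Duration) *memDedup {
	return &memDedup{ttl: ttl, expires: map[string]time.Time{}}
}

// CheckAndSet reports true when the (workspace, message) pair is new or its
// previous entry has expired, and false for a duplicate inside the TTL.
func (d *memDedup) CheckAndSet(_ context.Context, workspaceID, messageID string) (bool, error) {
	k := "dedup:" + workspaceID + ":" + messageID
	d.mu.Lock()
	defer d.mu.Unlock()
	now := time.Now()
	if exp, ok := d.expires[k]; ok && now.Before(exp) {
		return false, nil
	}
	d.expires[k] = now.Add(d.ttl)
	return true, nil
}

// isNew is a small convenience wrapper for the example.
func isNew(d *memDedup, workspaceID, messageID string) bool {
	ok, _ := d.CheckAndSet(context.Background(), workspaceID, messageID)
	return ok
}

func main() {
	d := newMemDedup(24 * time.Hour)
	fmt.Println(isNew(d, "ws1", "msg-1")) // true: first time seen
	fmt.Println(isNew(d, "ws1", "msg-1")) // false: duplicate
	fmt.Println(isNew(d, "ws2", "msg-1")) // true: keys are scoped per workspace
}
```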

@@ -0,0 +1,209 @@
package handler
import (
"bytes"
"encoding/json"
"errors"
"io"
"net/http"
"strconv"
"github.com/go-playground/validator/v10"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/middleware"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
"github.com/dbiz/cdp/ingestion/ingest/internal/service"
)
type EventHandler struct {
svc *service.IngestService
val *validator.Validate
log *zap.Logger
}
func NewEventHandler(svc *service.IngestService, log *zap.Logger) *EventHandler {
return &EventHandler{
svc: svc,
val: validator.New(),
log: log,
}
}
// ---------------------------------------------------------------------------
// Routes
// ---------------------------------------------------------------------------
// Single-event endpoints. They differ only in the `type` they apply when the
// body omits one, so they all funnel into one handler.
func (h *EventHandler) Track(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypeTrack)
}
func (h *EventHandler) Identify(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypeIdentify)
}
func (h *EventHandler) Page(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypePage)
}
func (h *EventHandler) Group(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypeGroup)
}
func (h *EventHandler) Alias(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypeAlias)
}
func (h *EventHandler) Screen(w http.ResponseWriter, r *http.Request) {
h.handleSingle(w, r, model.EventTypeScreen)
}
func (h *EventHandler) Batch(w http.ResponseWriter, r *http.Request) {
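body, err := io.ReadAll(r.Body)
if err != nil {
// Only a MaxBytesReader overflow is a 413; any other read failure is a bad request.
var tooLarge *http.MaxBytesError
if errors.As(err, &tooLarge) {
h.writeErr(w, apperr.PayloadTooLarge("payload too large"))
return
}
h.writeErr(w, apperr.BadRequest("could not read request body", "", err))
return
}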
var env model.BatchEnvelope
if err := json.Unmarshal(body, &env); err != nil {
h.writeErr(w, apperr.BadRequest("invalid json body", "", err))
return
}
if err := h.val.Struct(&env); err != nil {
h.writeErr(w, apperr.BadRequest("validation failed", firstField(err), err))
return
}
ictx := h.ingestCtx(r, body)
results := h.svc.IngestBatch(r.Context(), ictx, env.Batch)
// Per-event status -- 200 OK, with an array of {messageId, ok, error, field}.
type item struct {
MessageID string `json:"messageId"`
OK bool `json:"ok"`
Error string `json:"error,omitempty"`
Field string `json:"field,omitempty"`
}
out := make([]item, len(env.Batch))
for i, e := range env.Batch {
it := item{MessageID: e.MessageID, OK: true}
if results[i] != nil {
it.OK = false
if ae, ok := apperr.As(results[i]); ok {
it.Error = ae.Message
it.Field = ae.Field
} else {
it.Error = "internal error"
}
}
out[i] = it
}
writeJSON(w, http.StatusOK, map[string]any{"results": out})
}
// ---------------------------------------------------------------------------
// Health / Ready
// ---------------------------------------------------------------------------
func (h *EventHandler) Health(w http.ResponseWriter, r *http.Request) {
writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
}
func (h *EventHandler) Ready(w http.ResponseWriter, r *http.Request) {
// Liveness is enough for k8s readiness in this scaffold. Wire in real
// dependency checks (PG ping, Kafka ping) when needed.
writeJSON(w, http.StatusOK, map[string]string{"status": "ready"})
}
// ---------------------------------------------------------------------------
// shared helpers
// ---------------------------------------------------------------------------
func (h *EventHandler) handleSingle(w http.ResponseWriter, r *http.Request, t model.EventType) {
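body, err := io.ReadAll(r.Body)
if err != nil {
// Only a MaxBytesReader overflow is a 413; any other read failure is a bad request.
var tooLarge *http.MaxBytesError
if errors.As(err, &tooLarge) {
h.writeErr(w, apperr.PayloadTooLarge("payload too large"))
return
}
h.writeErr(w, apperr.BadRequest("could not read request body", "", err))
return
}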
var raw model.RawEvent
if err := json.NewDecoder(bytes.NewReader(body)).Decode(&raw); err != nil {
h.writeErr(w, apperr.BadRequest("invalid json body", "", err))
return
}
if raw.Type == "" {
raw.Type = t
}
if err := h.val.Struct(&raw); err != nil {
h.writeErr(w, apperr.BadRequest("validation failed", firstField(err), err))
return
}
ictx := h.ingestCtx(r, body)
if err := h.svc.Ingest(r.Context(), ictx, &raw); err != nil {
h.writeErr(w, err)
return
}
writeJSON(w, http.StatusOK, map[string]bool{"ok": true})
}
func (h *EventHandler) ingestCtx(r *http.Request, body []byte) service.IngestContext {
wk := middleware.WriteKeyFromCtx(r.Context())
return service.IngestContext{
WorkspaceID: wk.WorkspaceID,
SourceID: wk.SourceID,
IP: clientIP(r),
UserAgent: r.UserAgent(),
RawBody: body,
}
}
func (h *EventHandler) writeErr(w http.ResponseWriter, err error) {
if ae, ok := apperr.As(err); ok {
if ae.RetryAfter > 0 {
w.Header().Set("Retry-After", strconv.Itoa(ae.RetryAfter))
}
writeJSON(w, ae.Code, errorResponse{Error: ae.Message, Field: ae.Field})
if ae.Err != nil {
h.log.Warn("request error",
zap.Int("code", ae.Code),
zap.String("msg", ae.Message),
zap.Error(ae.Err))
}
return
}
h.log.Error("unhandled error", zap.Error(err))
writeJSON(w, http.StatusInternalServerError, errorResponse{Error: "internal server error"})
}
type errorResponse struct {
Error string `json:"error"`
Field string `json:"field,omitempty"`
}
func writeJSON(w http.ResponseWriter, status int, body any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(body)
}
func firstField(err error) string {
var verrs validator.ValidationErrors
if errors.As(err, &verrs) && len(verrs) > 0 {
return verrs[0].Field()
}
return ""
}
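// clientIP is a simplified variant of middleware.clientIP -- intentionally
// small, no shared types. It returns the X-Forwarded-For header verbatim (which
// may be a comma-separated list) and does not consult X-Real-Ip.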
func clientIP(r *http.Request) string {
if h := r.Header.Get("X-Forwarded-For"); h != "" {
return h
}
return r.RemoteAddr
}
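
The handlers above read bodies that the PayloadLimit middleware has already wrapped in `http.MaxBytesReader`. As a sketch of how a read failure caused by that cap can be told apart from other read errors, the example below checks for `*http.MaxBytesError` (available since Go 1.19). `readLimited` and `tryBody` are hypothetical helpers written for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// readLimited reads a body capped at max bytes and reports whether a failure
// was caused by the cap being exceeded. Hypothetical helper.
func readLimited(w http.ResponseWriter, r *http.Request, max int64) (body []byte, tooLarge bool, err error) {
	r.Body = http.MaxBytesReader(w, r.Body, max)
	body, err = io.ReadAll(r.Body)
	var mbe *http.MaxBytesError
	return body, errors.As(err, &mbe), err
}

// tryBody drives readLimited with an n-byte body and returns how many bytes
// were read and whether the cap was hit.
func tryBody(n int, max int64) (int, bool) {
	req := httptest.NewRequest(http.MethodPost, "/track", strings.NewReader(strings.Repeat("x", n)))
	body, tooLarge, _ := readLimited(httptest.NewRecorder(), req, max)
	return len(body), tooLarge
}

func main() {
	_, small := tryBody(10, 100)
	_, big := tryBody(200, 100)
	fmt.Println(small, big) // false true
}
```

A handler can use the boolean to answer 413 only for genuine overflows and 400 for everything else.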

@@ -0,0 +1,110 @@
// Package kafka wraps franz-go for the ingest producer.
//
// Design notes:
// - We use ProduceSync only for DLQ writes (rare; correctness > latency).
// - Happy-path Produce is fire-and-forget: we return 200 OK before the
// ack lands. franz-go buffers internally and retries.
// - Partition key = anonymous_id for the happy topic so that all events
// for a single visitor land on the same partition (ordering for stitching).
package kafka
import (
"context"
"encoding/json"
"fmt"
"github.com/twmb/franz-go/pkg/kgo"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
)
type Producer struct {
client *kgo.Client
log *zap.Logger
topicIngest string
topicDLQ string
topicRetry string
}
func NewProducer(brokers []string, topicIngest, topicDLQ, topicRetry string, log *zap.Logger) (*Producer, error) {
cl, err := kgo.NewClient(
kgo.SeedBrokers(brokers...),
kgo.ProducerLinger(5_000_000), // 5ms linger -> batch small bursts
kgo.ProducerBatchCompression(kgo.ZstdCompression()),
kgo.MaxBufferedRecords(100_000),
kgo.RequiredAcks(kgo.LeaderAck()),
kgo.ClientID("cdp-ingest"),
)
if err != nil {
return nil, fmt.Errorf("kafka client: %w", err)
}
if err := cl.Ping(context.Background()); err != nil {
cl.Close()
return nil, fmt.Errorf("kafka ping: %w", err)
}
return &Producer{
client: cl,
log: log,
topicIngest: topicIngest,
topicDLQ: topicDLQ,
topicRetry: topicRetry,
}, nil
}
func (p *Producer) Close() {
p.client.Close()
}
// Produce sends an event to the happy-path topic. Fire-and-forget.
func (p *Producer) Produce(ctx context.Context, ev *model.IngestedEvent) error {
payload, err := json.Marshal(ev)
if err != nil {
return fmt.Errorf("marshal event: %w", err)
}
rec := &kgo.Record{
Topic: p.topicIngest,
Key: []byte(ev.PartitionKey()),
Value: payload,
Headers: []kgo.RecordHeader{
{Key: "workspace_id", Value: []byte(ev.WorkspaceID)},
{Key: "source_id", Value: []byte(ev.SourceID)},
{Key: "type", Value: []byte(ev.Type)},
},
}
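// Detach from the request context: it is canceled as soon as the handler
// returns, which could fail records that are still buffered.
p.client.Produce(context.WithoutCancel(ctx), rec, func(r *kgo.Record, err error) {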
if err != nil {
p.log.Error("kafka produce failed",
zap.String("topic", r.Topic),
zap.String("message_id", ev.MessageID),
zap.Error(err))
}
})
return nil
}
// ProduceDLQ writes a failed event to the DLQ topic synchronously so we know
// it landed before responding to the user with the error.
func (p *Producer) ProduceDLQ(ctx context.Context, workspaceID, sourceID, messageID, reason, field string, raw []byte) error {
envelope := map[string]any{
"workspace_id": workspaceID,
"source_id": sourceID,
"message_id": messageID,
"reason": reason,
"field": field,
"raw_payload": string(raw),
}
payload, _ := json.Marshal(envelope)
rec := &kgo.Record{
Topic: p.topicDLQ,
Key: []byte(workspaceID),
Value: payload,
Headers: []kgo.RecordHeader{
{Key: "reason", Value: []byte(reason)},
},
}
if err := p.client.ProduceSync(ctx, rec).FirstErr(); err != nil {
return fmt.Errorf("dlq produce: %w", err)
}
return nil
}
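
The design notes rely on a property of keyed partitioning: records with the same key always map to the same partition, so events for one visitor stay ordered. The sketch below illustrates only that property. FNV-1a is a stand-in hash chosen for brevity (the Java Kafka client, for instance, uses murmur2), and `partitionFor` is a hypothetical function, not part of franz-go.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a record key to one of n partitions by hashing the key.
// Assumption: FNV-1a stands in for the client's real partitioner hash.
func partitionFor(key string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // hash.Hash.Write never returns an error
	return h.Sum32() % n
}

func main() {
	const partitions = 12
	a1 := partitionFor("anon-123", partitions)
	a2 := partitionFor("anon-123", partitions)
	fmt.Println(a1 == a2) // true: same key, same partition
}
```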

@@ -0,0 +1,193 @@
// Package middleware provides chi-compatible HTTP middleware: auth, logging,
// payload-limit, request-id, panic recovery, CORS.
package middleware
import (
"context"
"encoding/base64"
"net/http"
"runtime/debug"
"strings"
"time"
"github.com/google/uuid"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
"github.com/dbiz/cdp/ingestion/ingest/internal/service"
)
type ctxKey string
const (
ctxKeyRequestID ctxKey = "request_id"
ctxKeyWriteKey ctxKey = "write_key"
)
// RequestID reuses the incoming X-Request-Id header, or generates a uuid v4 when
// it is absent, stores the value in context, and echoes it on the response.
func RequestID(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
id := r.Header.Get("X-Request-Id")
if id == "" {
id = uuid.NewString()
}
ctx := context.WithValue(r.Context(), ctxKeyRequestID, id)
w.Header().Set("X-Request-Id", id)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
func RequestIDFromCtx(ctx context.Context) string {
v, _ := ctx.Value(ctxKeyRequestID).(string)
return v
}
// Recover handles panics so a buggy handler can't take down the server.
func Recover(log *zap.Logger) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
defer func() {
if rec := recover(); rec != nil {
log.Error("panic in handler",
zap.Any("panic", rec),
zap.String("path", r.URL.Path),
zap.ByteString("stack", debug.Stack()))
http.Error(w, `{"error":"internal server error"}`, http.StatusInternalServerError)
}
}()
next.ServeHTTP(w, r)
})
}
}
// PayloadLimit caps the request body size to limitKB kilobytes.
func PayloadLimit(limitKB int) func(http.Handler) http.Handler {
max := int64(limitKB) * 1024
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
r.Body = http.MaxBytesReader(w, r.Body, max)
next.ServeHTTP(w, r)
})
}
}
// Logger logs one structured line per request.
func Logger(log *zap.Logger) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
rw := &statusRecorder{ResponseWriter: w, status: 200}
next.ServeHTTP(rw, r)
log.Info("http",
zap.String("method", r.Method),
zap.String("path", r.URL.Path),
zap.Int("status", rw.status),
zap.Int64("duration_ms", time.Since(start).Milliseconds()),
zap.String("request_id", RequestIDFromCtx(r.Context())),
zap.String("ip", clientIP(r)))
})
}
}
// CORS returns a permissive CORS handler. Browser SDKs (web tracker) require it.
func CORS(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Authorization, Content-Type, X-Request-Id")
w.Header().Set("Access-Control-Max-Age", "86400")
if r.Method == http.MethodOptions {
w.WriteHeader(http.StatusNoContent)
return
}
next.ServeHTTP(w, r)
})
}
// Auth resolves the write key from the request and stores it in context.
// Accepts both `Authorization: Basic <base64(key:)>` (Segment-style) and
// `Authorization: Bearer <key>`.
func Auth(s *service.AuthService) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
key, err := extractWriteKey(r)
if err != nil {
writeAuthError(w, err)
return
}
wk, err := s.Resolve(r.Context(), key)
if err != nil {
writeAuthError(w, err)
return
}
ctx := context.WithValue(r.Context(), ctxKeyWriteKey, wk)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
}
// WriteKeyFromCtx returns the resolved key set by Auth middleware.
func WriteKeyFromCtx(ctx context.Context) *model.WriteKey {
v, _ := ctx.Value(ctxKeyWriteKey).(*model.WriteKey)
return v
}
// ---------------------------------------------------------------------------
// helpers
// ---------------------------------------------------------------------------
func extractWriteKey(r *http.Request) (string, error) {
h := r.Header.Get("Authorization")
if h == "" {
return "", apperr.Unauthorized("missing Authorization header")
}
if strings.HasPrefix(h, "Bearer ") {
return strings.TrimPrefix(h, "Bearer "), nil
}
if strings.HasPrefix(h, "Basic ") {
raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(h, "Basic "))
if err != nil {
return "", apperr.Unauthorized("invalid basic auth")
}
// Segment uses `key:` (no password). Take everything before the first colon.
s := string(raw)
if i := strings.Index(s, ":"); i >= 0 {
return s[:i], nil
}
return s, nil
}
return "", apperr.Unauthorized("unsupported auth scheme")
}
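func writeAuthError(w http.ResponseWriter, err error) {
code, msg := http.StatusUnauthorized, "unauthorized"
if ae, ok := apperr.As(err); ok {
code, msg = ae.Code, ae.Message
}
// http.Error would force Content-Type: text/plain, so write the JSON body directly.
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(code)
_, _ = w.Write([]byte(`{"error":"` + msg + `"}`))
}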
func clientIP(r *http.Request) string {
if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
if i := strings.Index(xff, ","); i >= 0 {
return strings.TrimSpace(xff[:i])
}
return strings.TrimSpace(xff)
}
if rip := r.Header.Get("X-Real-Ip"); rip != "" {
return rip
}
return r.RemoteAddr
}
type statusRecorder struct {
http.ResponseWriter
status int
}
func (s *statusRecorder) WriteHeader(code int) {
s.status = code
s.ResponseWriter.WriteHeader(code)
}
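
The Auth middleware accepts the write key either as a Bearer token or as Segment-style Basic auth, where the key is the username and the password is empty. The sketch below restates those parsing rules in a self-contained form. `keyFromHeader` and `basicHeader` are hypothetical names, and the boolean result replaces the `apperr` errors used in the real code.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// keyFromHeader extracts the write key from an Authorization header value.
// It returns false for a missing, malformed, or unsupported header.
func keyFromHeader(h string) (string, bool) {
	if rest, ok := strings.CutPrefix(h, "Bearer "); ok {
		return rest, true
	}
	if rest, ok := strings.CutPrefix(h, "Basic "); ok {
		raw, err := base64.StdEncoding.DecodeString(rest)
		if err != nil {
			return "", false
		}
		// Segment sends "key:" with an empty password; keep the part before ':'.
		key, _, _ := strings.Cut(string(raw), ":")
		return key, true
	}
	return "", false
}

// basicHeader builds the Segment-style header value: "Basic " + base64("key:").
func basicHeader(key string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(key+":"))
}

func main() {
	k1, _ := keyFromHeader("Bearer cdp_dev_writekey_1234567890")
	k2, _ := keyFromHeader(basicHeader("cdp_dev_writekey_1234567890"))
	_, ok := keyFromHeader("Digest abc")
	fmt.Println(k1 == k2, ok) // true false
}
```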

@@ -0,0 +1,81 @@
// Package model holds the wire and domain types passed between layers.
package model
import (
"encoding/json"
"time"
)
// EventType is the Segment-compatible call kind.
type EventType string
const (
EventTypeTrack EventType = "track"
EventTypeIdentify EventType = "identify"
EventTypePage EventType = "page"
EventTypeGroup EventType = "group"
EventTypeAlias EventType = "alias"
EventTypeScreen EventType = "screen"
)
// RawEvent is the parsed-but-not-yet-validated payload from a client.
// We keep Properties / Traits / Context as json.RawMessage so the handler can
// pass them through to the service untouched; flattening happens in service.
type RawEvent struct {
Type EventType `json:"type" validate:"required,oneof=track identify page group alias screen"`
MessageID string `json:"messageId" validate:"required,max=128"`
AnonymousID string `json:"anonymousId" validate:"max=128"`
UserID string `json:"userId" validate:"max=128"`
GroupID string `json:"groupId" validate:"max=128"`
Event string `json:"event" validate:"max=255"`
Name string `json:"name" validate:"max=255"`
Category string `json:"category" validate:"max=255"`
Properties json.RawMessage `json:"properties"`
Traits json.RawMessage `json:"traits"`
Context json.RawMessage `json:"context"`
Timestamp *time.Time `json:"timestamp"`
SentAt *time.Time `json:"sentAt"`
}
// BatchEnvelope is the body of /batch — Segment-compatible.
type BatchEnvelope struct {
Batch []RawEvent `json:"batch" validate:"required,min=1,max=1000,dive"`
SentAt *time.Time `json:"sentAt"`
Context json.RawMessage `json:"context"`
}
// IngestedEvent is the canonical record we push onto Kafka. Flat fields,
// timestamps already normalized, payload sanitized.
type IngestedEvent struct {
WorkspaceID string `json:"workspace_id"`
SourceID string `json:"source_id"`
MessageID string `json:"message_id"`
Type EventType `json:"type"`
AnonymousID string `json:"anonymous_id,omitempty"`
UserID string `json:"user_id,omitempty"`
GroupID string `json:"group_id,omitempty"`
Event string `json:"event,omitempty"`
Name string `json:"name,omitempty"`
Category string `json:"category,omitempty"`
Properties map[string]any `json:"properties,omitempty"`
Traits map[string]any `json:"traits,omitempty"`
Context map[string]any `json:"context,omitempty"`
IP string `json:"ip,omitempty"`
UserAgent string `json:"user_agent,omitempty"`
Timestamp time.Time `json:"timestamp"`
SentAt time.Time `json:"sent_at"`
ReceivedAt time.Time `json:"received_at"`
}
// PartitionKey returns the key used for Kafka partitioning. We prefer
// anonymous_id to keep identity-stitching ordering per visitor, then fall back
// to user_id and finally message_id.
func (e *IngestedEvent) PartitionKey() string {
if e.AnonymousID != "" {
return e.AnonymousID
}
if e.UserID != "" {
return e.UserID
}
return e.MessageID
}

@@ -0,0 +1,19 @@
package model
import "time"
// WriteKey is the auth credential supplied via Authorization header.
// We never store the raw value — only its sha256 hash and a short prefix
// for display in the console.
type WriteKey struct {
ID string
WorkspaceID string
SourceID string
KeyPrefix string
Label string
RevokedAt *time.Time
LastUsedAt *time.Time
CreatedAt time.Time
}
func (k *WriteKey) Revoked() bool { return k.RevokedAt != nil }

@@ -0,0 +1,102 @@
// Package ratelimit implements a Redis-backed sliding-window limiter.
//
// We use a sorted-set per workspace where the score is the unix-millisecond
// timestamp. On each request we:
// 1. ZREMRANGEBYSCORE -- evict entries older than window
// 2. ZCARD -- count current
// 3. if count < limit : ZADD + EXPIRE, allow
// 4. else : compute retry-after from oldest entry, deny
//
// Steps 1-3/4 are wrapped in a Lua script for atomicity.
package ratelimit
import (
"context"
"fmt"
"strconv"
"time"
"github.com/redis/rueidis"
)
type Decision struct {
Allowed bool
Remaining int
RetryAfterMS int
}
type Limiter interface {
Allow(ctx context.Context, workspaceID string, limit int, window time.Duration) (Decision, error)
}
type redisLimiter struct {
client rueidis.Client
}
func New(client rueidis.Client) Limiter {
return &redisLimiter{client: client}
}
// Lua script: KEYS[1]=zset key, ARGV[1]=now_ms, ARGV[2]=window_ms,
// ARGV[3]=limit, ARGV[4]=member (unique per request).
//
// Returns: {allowed (1/0), remaining, retry_after_ms}
const slidingWindowLua = `
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local member = ARGV[4]
local cutoff = now - window
redis.call('ZREMRANGEBYSCORE', key, 0, cutoff)
local count = tonumber(redis.call('ZCARD', key))
if count < limit then
redis.call('ZADD', key, now, member)
redis.call('PEXPIRE', key, window)
return {1, limit - count - 1, 0}
end
local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
local retry = window
if oldest and oldest[2] then
retry = (tonumber(oldest[2]) + window) - now
if retry < 0 then retry = 0 end
end
return {0, 0, retry}
`
func (l *redisLimiter) Allow(ctx context.Context, workspaceID string, limit int, window time.Duration) (Decision, error) {
key := "rate:" + workspaceID
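t := time.Now()
now := t.UnixMilli()
// The member must be unique per request: with a millisecond timestamp, requests
// landing in the same millisecond would overwrite each other in the sorted set.
member := strconv.FormatInt(t.UnixNano(), 10) + ":" + workspaceID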
cmd := l.client.B().Eval().Script(slidingWindowLua).
Numkeys(1).
Key(key).
Arg(strconv.FormatInt(now, 10),
strconv.FormatInt(window.Milliseconds(), 10),
strconv.Itoa(limit),
member).
Build()
res := l.client.Do(ctx, cmd)
if err := res.Error(); err != nil {
return Decision{}, fmt.Errorf("ratelimit eval: %w", err)
}
arr, err := res.ToArray()
if err != nil {
return Decision{}, fmt.Errorf("ratelimit bad reply: %w", err)
}
if len(arr) != 3 {
return Decision{}, fmt.Errorf("ratelimit bad reply: want 3 elements, got %d", len(arr))
}
allowed, _ := arr[0].AsInt64()
remaining, _ := arr[1].AsInt64()
retry, _ := arr[2].AsInt64()
return Decision{
Allowed: allowed == 1,
Remaining: int(remaining),
RetryAfterMS: int(retry),
}, nil
}
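
The evict, count, then admit-or-deny steps in the Lua script can be followed more easily in plain Go. The sketch below mirrors that logic with a slice standing in for the sorted set. `window` and `allow` are hypothetical names, and the example assumes timestamps arrive in non-decreasing order and ignores concurrency, which the real script gets from Redis.

```go
package main

import "fmt"

// window is a hypothetical in-memory stand-in for the per-workspace sorted
// set: request timestamps in milliseconds, kept in ascending order.
type window struct {
	stamps []int64
}

// allow mirrors the Lua script: evict old entries, count, then admit the
// request or compute how long the caller should wait.
func (w *window) allow(nowMS, windowMS int64, limit int) (allowed bool, remaining int, retryAfterMS int64) {
	cutoff := nowMS - windowMS
	i := 0
	for i < len(w.stamps) && w.stamps[i] <= cutoff { // ZREMRANGEBYSCORE key 0 cutoff
		i++
	}
	w.stamps = w.stamps[i:]

	if len(w.stamps) < limit { // ZCARD < limit
		w.stamps = append(w.stamps, nowMS) // ZADD
		return true, limit - len(w.stamps), 0
	}
	retry := windowMS
	if len(w.stamps) > 0 {
		// The oldest entry leaves the window at oldest + windowMS.
		retry = w.stamps[0] + windowMS - nowMS
		if retry < 0 {
			retry = 0
		}
	}
	return false, 0, retry
}

func main() {
	w := &window{}
	fmt.Println(w.allow(0, 1000, 2))    // true 1 0
	fmt.Println(w.allow(100, 1000, 2))  // true 0 0
	fmt.Println(w.allow(200, 1000, 2))  // false 0 800
	fmt.Println(w.allow(1000, 1000, 2)) // true 0 0
}
```

The last call succeeds because the entry at t=0 has aged out of the one-second window by t=1000.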

@@ -0,0 +1,33 @@
package repo
import (
"context"
"fmt"
"time"
"github.com/jackc/pgx/v5/pgxpool"
)
// NewPool creates a pgxpool with sensible defaults for ingest workloads.
// Pool size is small because ingest is mostly cache hits — Postgres is only
// touched on cache miss (write key lookup, schema upsert).
func NewPool(ctx context.Context, dsn string) (*pgxpool.Pool, error) {
cfg, err := pgxpool.ParseConfig(dsn)
if err != nil {
return nil, fmt.Errorf("parse pg dsn: %w", err)
}
cfg.MaxConns = 16
cfg.MinConns = 2
cfg.MaxConnIdleTime = 5 * time.Minute
cfg.HealthCheckPeriod = 30 * time.Second
pool, err := pgxpool.NewWithConfig(ctx, cfg)
if err != nil {
return nil, fmt.Errorf("pg connect: %w", err)
}
if err := pool.Ping(ctx); err != nil {
pool.Close()
return nil, fmt.Errorf("pg ping: %w", err)
}
return pool, nil
}

@@ -0,0 +1,61 @@
package repo
import (
"context"
"errors"
"fmt"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
)
// SchemaRepo records the data type observed for each (workspace, event_type, field)
// triple. The bulker / analytics layer uses this to detect type conflicts.
//
// In the ingest hot path we only *check* for conflict via UpsertField; the
// rebuild of the cached map is left to a background loader. We do not block
// the request waiting for upsert -- it is fire-and-forget.
type SchemaRepo interface {
// GetType returns the recorded type, or "" if the field has never been seen.
GetType(ctx context.Context, workspaceID, eventType, field string) (string, error)
// UpsertField records a new (or re-confirmed) field type.
UpsertField(ctx context.Context, workspaceID, eventType, field, dataType string) error
}
type schemaRepo struct {
db *pgxpool.Pool
}
func NewSchemaRepo(db *pgxpool.Pool) SchemaRepo {
return &schemaRepo{db: db}
}
func (r *schemaRepo) GetType(ctx context.Context, workspaceID, eventType, field string) (string, error) {
const q = `
SELECT data_type FROM schema_fields
WHERE workspace_id = $1::uuid AND event_type = $2 AND field = $3`
var t string
err := r.db.QueryRow(ctx, q, workspaceID, eventType, field).Scan(&t)
if err != nil {
// No row means the field has never been seen: return "" with a nil error.
// errors.Is also matches if the sentinel is ever wrapped.
if errors.Is(err, pgx.ErrNoRows) {
return "", nil
}
return "", apperr.Internal(fmt.Errorf("schema get: %w", err))
}
return t, nil
}
func (r *schemaRepo) UpsertField(ctx context.Context, workspaceID, eventType, field, dataType string) error {
const q = `
INSERT INTO schema_fields (workspace_id, event_type, field, data_type)
VALUES ($1::uuid, $2, $3, $4)
ON CONFLICT (workspace_id, event_type, field) DO UPDATE
SET last_seen_at = now(),
sample_count = schema_fields.sample_count + 1`
_, err := r.db.Exec(ctx, q, workspaceID, eventType, field, dataType)
if err != nil {
return apperr.Internal(fmt.Errorf("schema upsert: %w", err))
}
return nil
}


@@ -0,0 +1,66 @@
package repo
import (
"context"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
)
// WriteKeyRepo loads WriteKey records by their plaintext value.
// The plaintext is hashed before the lookup; the DB only stores hashes.
type WriteKeyRepo interface {
FindByPlaintext(ctx context.Context, plaintext string) (*model.WriteKey, error)
MarkUsed(ctx context.Context, id string) error
}
type writeKeyRepo struct {
db *pgxpool.Pool
}
func NewWriteKeyRepo(db *pgxpool.Pool) WriteKeyRepo {
return &writeKeyRepo{db: db}
}
func hashKey(plaintext string) string {
sum := sha256.Sum256([]byte(plaintext))
return hex.EncodeToString(sum[:])
}
func (r *writeKeyRepo) FindByPlaintext(ctx context.Context, plaintext string) (*model.WriteKey, error) {
const q = `
SELECT id::text, workspace_id::text, source_id::text,
key_prefix, COALESCE(label, ''),
revoked_at, last_used_at, created_at
FROM write_keys
WHERE key_hash = $1`
row := r.db.QueryRow(ctx, q, hashKey(plaintext))
var k model.WriteKey
err := row.Scan(&k.ID, &k.WorkspaceID, &k.SourceID,
&k.KeyPrefix, &k.Label,
&k.RevokedAt, &k.LastUsedAt, &k.CreatedAt)
if errors.Is(err, pgx.ErrNoRows) {
return nil, apperr.Unauthorized("invalid write key")
}
if err != nil {
return nil, apperr.Internal(fmt.Errorf("writekey lookup: %w", err))
}
return &k, nil
}
func (r *writeKeyRepo) MarkUsed(ctx context.Context, id string) error {
const q = `UPDATE write_keys SET last_used_at = now() WHERE id = $1`
_, err := r.db.Exec(ctx, q, id)
if err != nil {
return apperr.Internal(fmt.Errorf("writekey mark used: %w", err))
}
return nil
}


@@ -0,0 +1,92 @@
// Package schema turns nested JSON objects into flat key/value maps and
// classifies field types for conflict detection.
//
// Rules:
// - keys are joined with "_" : {"a": {"b": 1}} -> {"a_b": 1}
// - arrays are preserved as-is and not descended into
// - keys are sanitized: lowercase, non-[a-z0-9_] replaced with "_"
package schema
import (
"strings"
"unicode"
)
type DataType string
const (
TypeString DataType = "string"
TypeNumber DataType = "number"
TypeBoolean DataType = "boolean"
TypeObject DataType = "object"
TypeArray DataType = "array"
TypeTimestamp DataType = "timestamp"
TypeNull DataType = "null"
)
// Flatten flattens nested objects under a snake_case prefix.
// Returns a new map, never mutates input.
func Flatten(in map[string]any) map[string]any {
out := make(map[string]any, len(in))
for k, v := range in {
flattenInto(out, sanitize(k), v)
}
return out
}
func flattenInto(out map[string]any, prefix string, v any) {
switch x := v.(type) {
case map[string]any:
if len(x) == 0 {
out[prefix] = x
return
}
for k, child := range x {
flattenInto(out, prefix+"_"+sanitize(k), child)
}
default:
out[prefix] = v
}
}
// sanitize replaces characters outside [a-z0-9_] with "_" and lowercases.
// Leading underscores are kept; trailing underscores are trimmed.
func sanitize(k string) string {
if k == "" {
return k
}
var b strings.Builder
b.Grow(len(k))
for _, r := range k {
switch {
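case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '_':
b.WriteRune(r)
case r >= 'A' && r <= 'Z':
// ASCII-only on purpose: non-ASCII letters and digits fall through to "_",
// matching the documented [a-z0-9_] rule.
b.WriteRune(unicode.ToLower(r))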
default:
b.WriteRune('_')
}
}
return strings.TrimRight(b.String(), "_")
}
// Classify maps a Go value (from json.Unmarshal) to a DataType.
func Classify(v any) DataType {
switch v.(type) {
case nil:
return TypeNull
case bool:
return TypeBoolean
case float64, float32, int, int32, int64, uint, uint32, uint64:
return TypeNumber
case string:
return TypeString
case []any:
return TypeArray
case map[string]any:
return TypeObject
default:
return TypeString
}
}


@@ -0,0 +1,53 @@
package schema
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestFlatten_NestedObject(t *testing.T) {
in := map[string]any{
"user": map[string]any{
"id": "u_1",
"profile": map[string]any{"age": 30, "name": "Phuoc"},
},
"plan": "pro",
}
got := Flatten(in)
assert.Equal(t, "u_1", got["user_id"])
assert.Equal(t, 30, got["user_profile_age"])
assert.Equal(t, "Phuoc", got["user_profile_name"])
assert.Equal(t, "pro", got["plan"])
}
func TestFlatten_SanitizesKeys(t *testing.T) {
in := map[string]any{
"User Email": "x@y",
"price.usd": 9.99,
"meta!": map[string]any{"X-Y": 1},
}
got := Flatten(in)
assert.Equal(t, "x@y", got["user_email"])
assert.Equal(t, 9.99, got["price_usd"])
assert.Equal(t, 1, got["meta_x_y"])
}
func TestFlatten_PreservesArrays(t *testing.T) {
in := map[string]any{
"tags": []any{"a", "b"},
}
got := Flatten(in)
arr, ok := got["tags"].([]any)
assert.True(t, ok)
assert.Equal(t, 2, len(arr))
}
func TestClassify(t *testing.T) {
assert.Equal(t, TypeString, Classify("hi"))
assert.Equal(t, TypeNumber, Classify(float64(1.5)))
assert.Equal(t, TypeBoolean, Classify(true))
assert.Equal(t, TypeNull, Classify(nil))
assert.Equal(t, TypeArray, Classify([]any{1, 2}))
assert.Equal(t, TypeObject, Classify(map[string]any{}))
}


@@ -0,0 +1,115 @@
package service
import (
"context"
"sync"
"time"
"github.com/redis/rueidis"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
"github.com/dbiz/cdp/ingestion/ingest/internal/repo"
)
// AuthService resolves a plaintext Write Key into the workspace + source it
// authorizes for. Lookups are cached in process only; Redis is used solely as
// a pub/sub channel so the console can revoke a key and have the cached entry
// dropped right away. If an invalidation message is missed, the stale entry
// still expires after the cache TTL.
type AuthService struct {
repo repo.WriteKeyRepo
redis rueidis.Client
log *zap.Logger
ttl time.Duration
mu sync.RWMutex
cache map[string]cachedKey
}
type cachedKey struct {
key *model.WriteKey
expires time.Time
}
const (
redisKeyWritePrefix = "wk:" // wk:{plaintext} -> json
pubsubChannel = "wk:invalidate"
)
func NewAuthService(r repo.WriteKeyRepo, redis rueidis.Client, ttl time.Duration, log *zap.Logger) *AuthService {
s := &AuthService{
repo: r,
redis: redis,
log: log,
ttl: ttl,
cache: make(map[string]cachedKey),
}
go s.watchInvalidations()
return s
}
// Resolve returns the WriteKey for a plaintext token. Cached.
func (s *AuthService) Resolve(ctx context.Context, plaintext string) (*model.WriteKey, error) {
if plaintext == "" {
return nil, apperr.Unauthorized("missing write key")
}
// in-process cache
s.mu.RLock()
if entry, ok := s.cache[plaintext]; ok && time.Now().Before(entry.expires) {
s.mu.RUnlock()
if entry.key.Revoked() {
return nil, apperr.Unauthorized("write key revoked")
}
return entry.key, nil
}
s.mu.RUnlock()
// fall through to DB (Redis cache is optional and intentionally skipped
// here -- the in-process map is plenty fast; Redis is only used for the
// pub/sub invalidation channel below)
k, err := s.repo.FindByPlaintext(ctx, plaintext)
if err != nil {
return nil, err
}
if k.Revoked() {
return nil, apperr.Unauthorized("write key revoked")
}
s.mu.Lock()
s.cache[plaintext] = cachedKey{key: k, expires: time.Now().Add(s.ttl)}
s.mu.Unlock()
return k, nil
}
// Invalidate clears the cache entry for one key. Called by the console via
// pub/sub when a key is revoked.
func (s *AuthService) Invalidate(plaintext string) {
s.mu.Lock()
delete(s.cache, plaintext)
s.mu.Unlock()
}
func (s *AuthService) watchInvalidations() {
if s.redis == nil {
return
}
ctx := context.Background()
err := s.redis.Receive(ctx, s.redis.B().Subscribe().Channel(pubsubChannel).Build(),
func(msg rueidis.PubSubMessage) {
s.Invalidate(msg.Message)
s.log.Info("write key invalidated via pubsub", zap.String("prefix", maskKey(msg.Message)))
})
if err != nil {
s.log.Warn("pubsub subscribe ended", zap.Error(err))
}
}
// maskKey returns the first 8 chars + "***" for safe logging.
func maskKey(k string) string {
if len(k) <= 8 {
return "***"
}
return k[:8] + "***"
}


@@ -0,0 +1,223 @@
package service
import (
"context"
"encoding/json"
"time"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/dedup"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
"github.com/dbiz/cdp/ingestion/ingest/internal/ratelimit"
"github.com/dbiz/cdp/ingestion/ingest/internal/repo"
"github.com/dbiz/cdp/ingestion/ingest/internal/schema"
)
// Producer is the small surface IngestService needs from the Kafka client.
// Defined here so it can be stubbed in tests without pulling in franz-go.
type Producer interface {
Produce(ctx context.Context, ev *model.IngestedEvent) error
ProduceDLQ(ctx context.Context, workspaceID, sourceID, messageID, reason, field string, raw []byte) error
}
// IngestService is the core pipeline: validate → ratelimit → timestamp normalize
// → late-check → dedup → flatten → schema-conflict → push Kafka.
type IngestService struct {
producer Producer
limiter ratelimit.Limiter
dedup dedup.Dedup
schema repo.SchemaRepo
log *zap.Logger
lateAfter time.Duration
}
// IngestDeps groups dependencies for cleaner construction.
type IngestDeps struct {
Producer Producer
Limiter ratelimit.Limiter
Dedup dedup.Dedup
Schema repo.SchemaRepo
Log *zap.Logger
LateAfter time.Duration
}
func NewIngestService(d IngestDeps) *IngestService {
return &IngestService{
producer: d.Producer,
limiter: d.Limiter,
dedup: d.Dedup,
schema: d.Schema,
log: d.Log,
lateAfter: d.LateAfter,
}
}
// IngestContext carries per-request data set by middleware.
type IngestContext struct {
WorkspaceID string
SourceID string
IP string
UserAgent string
RawBody []byte // original body, used for DLQ payload
}
// Ingest runs the full pipeline for a single event.
func (s *IngestService) Ingest(ctx context.Context, ictx IngestContext, raw *model.RawEvent) error {
now := time.Now().UTC()
// 3. rate limit per workspace
dec, err := s.limiter.Allow(ctx, ictx.WorkspaceID, defaultTierLimit, time.Second)
if err != nil {
return apperr.Internal(err)
}
if !dec.Allowed {
retry := (dec.RetryAfterMS / 1000) + 1
return apperr.TooManyRequests(retry)
}
// 4-5. timestamps + late-event check
sentAt := derefTime(raw.SentAt, now)
if now.Sub(sentAt) > s.lateAfter {
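// Report the configured threshold instead of a hardcoded "24h".
return apperr.UnprocessableEntity("event too old (older than " + s.lateAfter.String() + ")")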
}
timestamp := derefTime(raw.Timestamp, sentAt)
// 6. dedup
if raw.MessageID == "" {
return apperr.BadRequest("messageId required", "messageId", nil)
}
fresh, err := s.dedup.CheckAndSet(ctx, ictx.WorkspaceID, raw.MessageID)
if err != nil {
return apperr.Internal(err)
}
if !fresh {
// silently drop -- duplicate message
return nil
}
// 7. flatten properties / traits / context
props, err := decodeAndFlatten(raw.Properties)
if err != nil {
_ = s.toDLQ(ctx, ictx, raw, "properties_invalid_json", "properties")
return apperr.BadRequest("properties is not valid JSON object", "properties", err)
}
traits, err := decodeAndFlatten(raw.Traits)
if err != nil {
_ = s.toDLQ(ctx, ictx, raw, "traits_invalid_json", "traits")
return apperr.BadRequest("traits is not valid JSON object", "traits", err)
}
contextMap, err := decodeAndFlatten(raw.Context)
if err != nil {
// context is best-effort: keep going without it
contextMap = nil
}
// 8. schema validation -- type conflict detection (best-effort, async upsert)
if err := s.checkSchema(ctx, ictx.WorkspaceID, string(raw.Type), props); err != nil {
_ = s.toDLQ(ctx, ictx, raw, "schema_conflict", "")
return err
}
ev := &model.IngestedEvent{
WorkspaceID: ictx.WorkspaceID,
SourceID: ictx.SourceID,
MessageID: raw.MessageID,
Type: raw.Type,
AnonymousID: raw.AnonymousID,
UserID: raw.UserID,
GroupID: raw.GroupID,
Event: raw.Event,
Name: raw.Name,
Category: raw.Category,
Properties: props,
Traits: traits,
Context: contextMap,
IP: ictx.IP,
UserAgent: ictx.UserAgent,
Timestamp: timestamp,
SentAt: sentAt,
ReceivedAt: now,
}
// 9. push to Kafka -- a produce error is returned to the caller as a 500
if err := s.producer.Produce(ctx, ev); err != nil {
return apperr.Internal(err)
}
return nil
}
// IngestBatch processes a batch envelope; each failure is recorded but the
// good events still ship. Returns one error slot per event, in batch order
// (nil on success), so the handler can report per-event status; in practice
// batch endpoints return 200 with that per-event status.
func (s *IngestService) IngestBatch(ctx context.Context, ictx IngestContext, batch []model.RawEvent) []error {
errs := make([]error, len(batch))
for i := range batch {
errs[i] = s.Ingest(ctx, ictx, &batch[i])
}
return errs
}
// ---------------------------------------------------------------------------
// helpers
// ---------------------------------------------------------------------------
const defaultTierLimit = 100 // rps; per-tier override comes from workspace.tier later
func derefTime(p *time.Time, fallback time.Time) time.Time {
if p == nil || p.IsZero() {
return fallback
}
return p.UTC()
}
func decodeAndFlatten(raw json.RawMessage) (map[string]any, error) {
if len(raw) == 0 {
return nil, nil
}
var m map[string]any
if err := json.Unmarshal(raw, &m); err != nil {
return nil, err
}
if m == nil {
return nil, nil
}
return schema.Flatten(m), nil
}
// checkSchema looks up the recorded type per (workspace, event_type, field)
// and rejects with 400 on conflict. New fields are recorded asynchronously --
// we do not block the request waiting on the DB write.
func (s *IngestService) checkSchema(ctx context.Context, workspaceID, eventType string, props map[string]any) error {
for field, v := range props {
dt := string(schema.Classify(v))
if dt == string(schema.TypeNull) {
continue
}
existing, err := s.schema.GetType(ctx, workspaceID, eventType, field)
if err != nil {
// soft-fail: don't block ingest on schema DB errors
s.log.Warn("schema lookup failed", zap.String("field", field), zap.Error(err))
continue
}
if existing == "" {
// fire-and-forget upsert
go func(f, t string) {
if err := s.schema.UpsertField(context.Background(), workspaceID, eventType, f, t); err != nil {
s.log.Warn("schema upsert failed", zap.String("field", f), zap.Error(err))
}
}(field, dt)
continue
}
if existing != dt {
return apperr.BadRequest("schema type conflict", field, nil)
}
}
return nil
}
func (s *IngestService) toDLQ(ctx context.Context, ictx IngestContext, raw *model.RawEvent, reason, field string) error {
return s.producer.ProduceDLQ(ctx,
ictx.WorkspaceID, ictx.SourceID, raw.MessageID, reason, field, ictx.RawBody)
}


@@ -0,0 +1,150 @@
package service
import (
"context"
"encoding/json"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"go.uber.org/zap"
"github.com/dbiz/cdp/ingestion/ingest/internal/apperr"
"github.com/dbiz/cdp/ingestion/ingest/internal/model"
"github.com/dbiz/cdp/ingestion/ingest/internal/ratelimit"
)
// ---------------------------------------------------------------------------
// Stubs -- enough surface to drive the IngestService without spinning Kafka
// or Redis. We exercise the pipeline branches: late event, dedup hit, schema
// conflict, happy path.
// ---------------------------------------------------------------------------
type fakeLimiter struct{ allow bool }
func (f *fakeLimiter) Allow(_ context.Context, _ string, _ int, _ time.Duration) (ratelimit.Decision, error) {
if f.allow {
return ratelimit.Decision{Allowed: true, Remaining: 99}, nil
}
return ratelimit.Decision{Allowed: false, RetryAfterMS: 500}, nil
}
type fakeDedup struct{ fresh bool }
func (f *fakeDedup) CheckAndSet(_ context.Context, _, _ string) (bool, error) { return f.fresh, nil }
type fakeSchema struct {
stored map[string]string
}
func (f *fakeSchema) GetType(_ context.Context, _, _, field string) (string, error) {
if t, ok := f.stored[field]; ok {
return t, nil
}
return "", nil
}
func (f *fakeSchema) UpsertField(_ context.Context, _, _, field, dt string) error {
if f.stored == nil {
f.stored = map[string]string{}
}
f.stored[field] = dt
return nil
}
// fakeProducer captures pushes so tests can assert side effects.
type fakeProducer struct {
produced []*model.IngestedEvent
dlq []string // reason values
}
func (f *fakeProducer) Produce(_ context.Context, ev *model.IngestedEvent) error {
f.produced = append(f.produced, ev)
return nil
}
func (f *fakeProducer) ProduceDLQ(_ context.Context, _, _, _, reason, _ string, _ []byte) error {
f.dlq = append(f.dlq, reason)
return nil
}
// ---------------------------------------------------------------------------
func newSvc(t *testing.T, limiter *fakeLimiter, dedupSvc *fakeDedup, sch *fakeSchema) (*IngestService, *fakeProducer) {
t.Helper()
prod := &fakeProducer{}
return &IngestService{
producer: prod,
limiter: limiter,
dedup: dedupSvc,
schema: sch,
log: zap.NewNop(),
lateAfter: 24 * time.Hour,
}, prod
}
func TestIngest_RateLimited(t *testing.T) {
svc, _ := newSvc(t, &fakeLimiter{allow: false}, &fakeDedup{fresh: true}, &fakeSchema{})
err := svc.Ingest(context.Background(), IngestContext{WorkspaceID: "ws"},
&model.RawEvent{Type: model.EventTypeTrack, MessageID: "m1"})
ae, ok := apperr.As(err)
require.True(t, ok)
assert.Equal(t, 429, ae.Code)
assert.Greater(t, ae.RetryAfter, 0)
}
func TestIngest_LateEvent(t *testing.T) {
svc, _ := newSvc(t, &fakeLimiter{allow: true}, &fakeDedup{fresh: true}, &fakeSchema{})
old := time.Now().Add(-48 * time.Hour)
err := svc.Ingest(context.Background(), IngestContext{WorkspaceID: "ws"},
&model.RawEvent{Type: model.EventTypeTrack, MessageID: "m1", SentAt: &old})
ae, ok := apperr.As(err)
require.True(t, ok)
assert.Equal(t, 422, ae.Code)
}
func TestIngest_DuplicateMessageSilentlyDropped(t *testing.T) {
svc, prod := newSvc(t, &fakeLimiter{allow: true}, &fakeDedup{fresh: false}, &fakeSchema{})
err := svc.Ingest(context.Background(), IngestContext{WorkspaceID: "ws"},
&model.RawEvent{Type: model.EventTypeTrack, MessageID: "m1"})
assert.NoError(t, err)
assert.Empty(t, prod.produced, "duplicate must not be produced")
}
func TestIngest_SchemaConflict(t *testing.T) {
svc, prod := newSvc(t, &fakeLimiter{allow: true}, &fakeDedup{fresh: true},
&fakeSchema{stored: map[string]string{"price": "string"}})
props, _ := json.Marshal(map[string]any{"price": 9.99})
err := svc.Ingest(context.Background(),
IngestContext{WorkspaceID: "ws"},
&model.RawEvent{
Type: model.EventTypeTrack,
MessageID: "m1",
Properties: props,
})
ae, ok := apperr.As(err)
require.True(t, ok)
assert.Equal(t, 400, ae.Code)
assert.Equal(t, "price", ae.Field)
assert.Equal(t, []string{"schema_conflict"}, prod.dlq)
assert.Empty(t, prod.produced)
}
func TestIngest_HappyPath(t *testing.T) {
svc, prod := newSvc(t, &fakeLimiter{allow: true}, &fakeDedup{fresh: true}, &fakeSchema{})
props, _ := json.Marshal(map[string]any{"plan": "pro"})
err := svc.Ingest(context.Background(),
IngestContext{WorkspaceID: "ws", SourceID: "src", IP: "1.1.1.1"},
&model.RawEvent{
Type: model.EventTypeTrack,
MessageID: "m1",
AnonymousID: "anon-1",
Event: "Signed Up",
Properties: props,
})
require.NoError(t, err)
require.Len(t, prod.produced, 1)
ev := prod.produced[0]
assert.Equal(t, "ws", ev.WorkspaceID)
assert.Equal(t, "anon-1", ev.PartitionKey())
assert.Equal(t, "pro", ev.Properties["plan"])
}


@@ -0,0 +1,18 @@
FROM node:20-bookworm-slim AS build
WORKDIR /app
COPY package.json ./
# isolated-vm needs a build toolchain
RUN apt-get update && apt-get install -y --no-install-recommends \
python3 make g++ \
&& npm install --omit=dev \
&& apt-get purge -y python3 make g++ && apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/*
COPY src ./src
FROM node:20-bookworm-slim
WORKDIR /app
COPY --from=build /app /app
ENV NODE_ENV=production
EXPOSE 3401
USER node
CMD ["node", "src/index.js"]

45
ingestion/rotor/README.md Normal file

@@ -0,0 +1,45 @@
# rotor
CDP JS Functions runner. Executes user-supplied JavaScript inside V8 isolates
(via `isolated-vm`), enforcing a memory + wall-clock limit per invocation.
## Endpoints
| Method | Path | Body | Notes |
|--------|------|------|-------|
| `POST` | `/v1/run` | `{ code, event }` | run ad-hoc code on one event |
| `POST` | `/v1/transform` | `{ workspace_id, function, event }` | run a registered function |
| `POST` | `/v1/functions` | `{ workspace_id, slug, code }` | upsert function code (admin) |
| `DELETE` | `/v1/functions/:workspace/:slug` | — | invalidate cache entry |
| `GET` | `/health` | — | liveness |
| `GET` | `/ready` | — | readiness |
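
As a rough sketch of how the admin and transform endpoints in the table fit together, the snippet below registers a function and then runs it against one event. The helper name `registerAndTransform`, the injectable `fetchImpl` parameter, and the base URL are assumptions made for illustration; only the paths and body fields come from the table.

```javascript
// Hypothetical client helper, not part of rotor itself.
// fetchImpl defaults to the global fetch available in Node 20+.
async function registerAndTransform(baseUrl, workspaceId, slug, code, event, fetchImpl = fetch) {
  // POST a JSON body and fail loudly on any non-2xx reply.
  const post = async (path, body) => {
    const res = await fetchImpl(`${baseUrl}${path}`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(body),
    });
    const json = await res.json();
    if (!res.ok) {
      throw new Error(`${path} failed with ${res.status}: ${json.error}`);
    }
    return json;
  };

  // 1. Upsert the function code under (workspace_id, slug).
  await post('/v1/functions', { workspace_id: workspaceId, slug, code });

  // 2. Run the registered function against a single event.
  return post('/v1/transform', { workspace_id: workspaceId, function: slug, event });
}
```

Against a local instance this could be called as `registerAndTransform('http://localhost:3401', 'ws', 'tag', code, event)`, assuming the default port. Passing a stub as `fetchImpl` keeps the helper testable without a running server.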
## User function contract
The submitted code must define a global function `transform(event)`. The
function can return:
- `event` (possibly mutated) — emit one event
- `null` / `undefined` — drop the event
- `Array<event>` — fan-out into multiple events
Sync return only (no `async`). The runner enforces:
- Memory limit: `ROTOR_ISOLATE_MEMORY_MB` (default 128MB)
- CPU/wall limit: `ROTOR_FUNCTION_TIMEOUT_MS` (default 2000ms)
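
The three return shapes in the contract can be sketched in one function. The property names `internal` and `items`, and the derived `message_id` suffix for fanned-out events, are hypothetical and only serve to exercise each branch.

```javascript
// Illustrative user function; field names are assumptions, not a rotor convention.
function transform(event) {
  const { items, ...rest } = event.properties || {};

  // Drop: returning null discards the event.
  if (rest.internal === true) return null;

  // Fan-out: one output event per array element, each with its own message_id.
  if (Array.isArray(items)) {
    return items.map((item, i) => ({
      ...event,
      message_id: `${event.message_id}-${i}`,
      properties: { ...rest, item },
    }));
  }

  // Pass-through: mutate and return the single event.
  event.properties = { ...(event.properties || {}), tagged: true };
  return event;
}
```

The function is synchronous and uses only plain data, so its return value can be copied back out of the isolate.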
## Local dev
```bash
npm install
npm run dev
```
Smoke test:
```bash
curl -s -X POST localhost:3401/v1/run -H 'content-type: application/json' -d '{
"code": "function transform(event) { event.properties = { tagged: true }; return event; }",
"event": { "message_id": "m1", "workspace_id": "ws", "type": "track" }
}'
```


@@ -0,0 +1,23 @@
{
"name": "cdp-rotor",
"version": "0.1.0",
"private": true,
"description": "CDP JS Functions runner -- V8 isolates",
"type": "module",
"main": "src/index.js",
"scripts": {
"dev": "node --enable-source-maps src/index.js",
"start": "node src/index.js",
"test": "node --test test"
},
"engines": {
"node": ">=20"
},
"dependencies": {
"fastify": "^4.28.1",
"isolated-vm": "^5.0.1",
"pino": "^9.4.0",
"zod": "^3.23.8"
},
"devDependencies": {}
}


@@ -0,0 +1,126 @@
// HTTP API for rotor.
//
// Endpoints
// POST /v1/run run an ad-hoc function against a single event
// POST /v1/transform run a registered function (by workspace/slug)
// POST /v1/functions upsert function code (admin)
// DELETE /v1/functions/:workspace/:slug invalidate
// GET /health liveness
// GET /ready readiness
//
// All responses are JSON. Validation errors return { error, issues };
// function errors return { error, kind }.
import Fastify from 'fastify';
import { z } from 'zod';
import { FunctionError, IsolateRunner } from '../runtime/isolate.js';
import { Registry } from '../registry/registry.js';
const eventSchema = z.object({
workspace_id: z.string(),
source_id: z.string().optional(),
message_id: z.string(),
type: z.enum(['track', 'identify', 'page', 'group', 'alias', 'screen']),
anonymous_id: z.string().optional(),
user_id: z.string().optional(),
event: z.string().optional(),
properties: z.record(z.unknown()).optional(),
traits: z.record(z.unknown()).optional(),
context: z.record(z.unknown()).optional(),
timestamp: z.string().optional(),
sent_at: z.string().optional(),
received_at: z.string().optional(),
}).passthrough();
const runSchema = z.object({
code: z.string().min(1).max(64 * 1024),
event: eventSchema,
});
const transformSchema = z.object({
workspace_id: z.string(),
function: z.string(),
event: eventSchema,
});
const upsertSchema = z.object({
workspace_id: z.string(),
slug: z.string(),
code: z.string().min(1).max(64 * 1024),
version: z.number().int().optional(),
});
export function buildServer({ config, logger }) {
const runner = new IsolateRunner(config.isolate);
const registry = new Registry();
const app = Fastify({
logger: logger,
bodyLimit: 1 * 1024 * 1024, // 1MB cap on incoming code/payloads
disableRequestLogging: false,
});
app.get('/health', async () => ({ status: 'ok' }));
app.get('/ready', async () => ({ status: 'ready' }));
app.post('/v1/run', async (req, reply) => {
const parsed = runSchema.safeParse(req.body);
if (!parsed.success) {
reply.status(400);
return { error: 'invalid request', issues: parsed.error.flatten() };
}
try {
const result = await runner.run(parsed.data.code, parsed.data.event);
return { result };
} catch (err) {
return handleFnError(reply, err);
}
});
app.post('/v1/transform', async (req, reply) => {
const parsed = transformSchema.safeParse(req.body);
if (!parsed.success) {
reply.status(400);
return { error: 'invalid request', issues: parsed.error.flatten() };
}
const { workspace_id, function: slug, event } = parsed.data;
const entry = await registry.get(workspace_id, slug);
if (!entry) {
reply.status(404);
return { error: 'function not found', workspace_id, slug };
}
try {
const result = await runner.run(entry.code, event);
return { result, version: entry.version };
} catch (err) {
return handleFnError(reply, err);
}
});
app.post('/v1/functions', async (req, reply) => {
const parsed = upsertSchema.safeParse(req.body);
if (!parsed.success) {
reply.status(400);
return { error: 'invalid request', issues: parsed.error.flatten() };
}
const { workspace_id, slug, code, version } = parsed.data;
registry.set(workspace_id, slug, code, version ?? 1);
return { ok: true };
});
app.delete('/v1/functions/:workspace/:slug', async (req) => {
registry.invalidate(req.params.workspace, req.params.slug);
return { ok: true };
});
return { app, registry, runner };
}
function handleFnError(reply, err) {
if (err instanceof FunctionError) {
const code = err.kind === 'timeout' ? 422 : 400;
reply.status(code);
return { error: err.message, kind: err.kind };
}
reply.status(500);
return { error: 'internal error' };
}


@@ -0,0 +1,9 @@
// Runtime config loaded from env. Defaults mirror .env.example.
export const config = {
port: parseInt(process.env.ROTOR_PORT ?? '3401', 10),
logLevel: process.env.ROTOR_LOG_LEVEL ?? 'info',
isolate: {
memoryLimitMB: parseInt(process.env.ROTOR_ISOLATE_MEMORY_MB ?? '128', 10),
timeoutMs: parseInt(process.env.ROTOR_FUNCTION_TIMEOUT_MS ?? '2000', 10),
},
};


@@ -0,0 +1,34 @@
// rotor entry point -- starts the Fastify HTTP API.
import pino from 'pino';
import { config } from './config.js';
import { buildServer } from './api/server.js';
const logger = pino({ level: config.logLevel });
const { app } = buildServer({ config, logger });
async function start() {
try {
await app.listen({ host: '0.0.0.0', port: config.port });
logger.info({ port: config.port }, 'rotor listening');
} catch (err) {
logger.error({ err }, 'rotor failed to start');
process.exit(1);
}
}
function shutdown(signal) {
logger.info({ signal }, 'shutdown signal received');
app.close()
.then(() => process.exit(0))
.catch((err) => {
logger.error({ err }, 'shutdown error');
process.exit(1);
});
}
process.on('SIGINT', () => shutdown('SIGINT'));
process.on('SIGTERM', () => shutdown('SIGTERM'));
start();


@@ -0,0 +1,41 @@
// Function registry -- an in-memory cache of (workspace_id, slug) -> code.
//
// The console writes function code into Postgres; rotor loads it lazily on
// first miss and refreshes on pub/sub invalidation.
//
// For this scaffold we keep it dumb: a Map you can preload via
// POST /v1/functions or set directly in tests. Replace `loader` with a real
// PG loader when the console exists.
export class Registry {
/**
* @param {{ loader?: (workspaceId: string, slug: string) => Promise<string|null> }} opts
*/
constructor(opts = {}) {
this.loader = opts.loader ?? (async () => null);
/** @type {Map<string, { code: string, version: number }>} */
this.cache = new Map();
}
key(workspaceId, slug) { return `${workspaceId}:${slug}`; }
async get(workspaceId, slug) {
const k = this.key(workspaceId, slug);
if (this.cache.has(k)) return this.cache.get(k);
const code = await this.loader(workspaceId, slug);
if (code == null) return null;
const entry = { code, version: 1 };
this.cache.set(k, entry);
return entry;
}
set(workspaceId, slug, code, version = 1) {
this.cache.set(this.key(workspaceId, slug), { code, version });
}
invalidate(workspaceId, slug) {
this.cache.delete(this.key(workspaceId, slug));
}
}


@@ -0,0 +1,97 @@
// V8 isolate wrapper for running user-supplied JS functions safely.
//
// Each invocation:
// 1. Build a fresh isolate + context (disposed afterwards, so no state leaks between runs).
// 2. Compile the user code (recompiled on every call; the registry caches source only).
// 3. Call `transform(event)` with a deep-copied event payload.
// 4. Receive a return value (deep-copied back) within `timeoutMs`.
//
// Failures (compile error / runtime error / timeout) are surfaced as
// FunctionError so the caller can route the event to DLQ.
import ivm from 'isolated-vm';
export class FunctionError extends Error {
constructor(kind, message, cause) {
super(message);
this.name = 'FunctionError';
this.kind = kind; // 'compile' | 'runtime' | 'timeout' | 'oom'
this.cause = cause;
}
}
export class IsolateRunner {
constructor({ memoryLimitMB, timeoutMs }) {
this.memoryLimitMB = memoryLimitMB;
this.timeoutMs = timeoutMs;
}
/**
* Run `code` against `event` and return the transformed value.
* `code` must define a global function named `transform` -- e.g.:
*
* function transform(event) {
* event.properties = event.properties || {};
* event.properties.tagged = true;
* return event;
* }
*
* The function may return:
* - the event (possibly mutated)
* - null -- drop the event
* - array -- fan-out into multiple events
*/
async run(code, event) {
const isolate = new ivm.Isolate({ memoryLimit: this.memoryLimitMB });
try {
const context = await isolate.createContext();
const jail = context.global;
await jail.set('global', jail.derefInto());
let script;
try {
script = await isolate.compileScript(buildHarness(code));
} catch (err) {
throw new FunctionError('compile', err.message, err);
}
try {
await script.run(context, { timeout: this.timeoutMs });
} catch (err) {
throw this.toFunctionError(err);
}
const fn = await context.global.get('__cdp_transform__', { reference: true });
try {
// transform() itself runs here, so timeouts and OOMs must be classified too.
return await fn.apply(undefined, [new ivm.ExternalCopy(event).copyInto()], {
timeout: this.timeoutMs,
result: { copy: true },
});
} catch (err) {
throw this.toFunctionError(err);
}
} finally {
isolate.dispose();
}
}
// Map an error raised by isolated-vm onto a FunctionError kind.
toFunctionError(err) {
if (err instanceof FunctionError) return err;
const msg = err?.message ?? String(err);
if (msg.includes('Script execution timed out')) {
return new FunctionError('timeout', `function timed out after ${this.timeoutMs}ms`, err);
}
if (msg.includes('Isolate was disposed')) {
return new FunctionError('oom', 'isolate ran out of memory', err);
}
return new FunctionError('runtime', msg, err);
}
}
/**
* Wrap user code so we can call it deterministically. The harness:
* - injects a `console.log` shim that drops output (we'll add capture later)
* - exposes `transform` on the global as `__cdp_transform__`
*/
function buildHarness(code) {
return `
const console = {
log: () => {}, warn: () => {}, error: () => {}, info: () => {},
};
${code}
if (typeof transform !== 'function') {
throw new Error('user code must define a global function named "transform"');
}
global.__cdp_transform__ = transform;
`;
}


@@ -0,0 +1,51 @@
import { test } from 'node:test';
import assert from 'node:assert/strict';
import { IsolateRunner, FunctionError } from '../src/runtime/isolate.js';
test('passes event through a noop transform', async () => {
const r = new IsolateRunner({ memoryLimitMB: 64, timeoutMs: 1000 });
const code = `
function transform(event) {
return event;
}
`;
const got = await r.run(code, { message_id: 'm1', type: 'track' });
assert.equal(got.message_id, 'm1');
});
test('mutates event properties', async () => {
const r = new IsolateRunner({ memoryLimitMB: 64, timeoutMs: 1000 });
const code = `
function transform(event) {
event.properties = event.properties || {};
event.properties.tagged = true;
return event;
}
`;
const got = await r.run(code, { message_id: 'm1', type: 'track' });
assert.equal(got.properties.tagged, true);
});
test('returns null to drop event', async () => {
const r = new IsolateRunner({ memoryLimitMB: 64, timeoutMs: 1000 });
const code = `function transform(event) { return null; }`;
const got = await r.run(code, { message_id: 'm1', type: 'track' });
assert.equal(got, null);
});
test('rejects code without transform()', async () => {
const r = new IsolateRunner({ memoryLimitMB: 64, timeoutMs: 1000 });
await assert.rejects(
() => r.run(`const x = 1;`, { message_id: 'm1', type: 'track' }),
(err) => err instanceof FunctionError && err.kind === 'runtime',
);
});
test('times out infinite loops', async () => {
const r = new IsolateRunner({ memoryLimitMB: 64, timeoutMs: 100 });
const code = `function transform(event) { while (true) {} return event; }`;
await assert.rejects(
() => r.run(code, { message_id: 'm1', type: 'track' }),
(err) => err instanceof FunctionError && err.kind === 'timeout',
);
});