CDP Analytics (data-layer)

Read-side of the self-hosted CDP platform. Queries events written by cdp-ingestion, computes traits and segments, and activates segments to external tools.

Services

Service  Lang          Port  Role
api      Go            4000  Query API, Profile API, Custom SQL sandbox
workers  Go (river)    4001  Computed Traits, Segment refresh, Reverse ETL
console  React + Vite  4002  Analytics UI

Quick start

Shared infra (Postgres / Redis / ClickHouse) is brought up by the ingestion repo. Start it there first:

cd ../ingestion && make up

Then in this directory:

make migrate/up            # apply analytics PostgreSQL migrations
make clickhouse/up         # apply analytics ClickHouse DDL (if any)

# First time only:
(cd api && go mod tidy)
(cd workers && go mod tidy)
(cd console && npm install)

make run/api               # start API on :4000
make run/workers           # start worker on :4001
make run/console           # start console on :4002

Endpoints (shipped)

All endpoints below require an X-Workspace-Id header (UUID). Workspace membership checks and authentication are not implemented yet (TODO), so the header value is currently the only authority.

Method  Path                   Priority  Description
GET     /health                -         Liveness
GET     /ready                 -         Readiness
POST    /query/events          P0        Filter raw events on one of events_track/identify/page/group
POST    /query/sql             P0        Arbitrary SELECT on ClickHouse (read-only user)
GET     /profiles/{id}         P0        Unified profile lookup
GET     /profiles/{id}/events  P0        Merged event timeline for the profile's user_id
POST    /queries               P0        Create saved query
GET     /queries               P0        List saved queries
GET     /queries/{id}          P0        Get saved query
PUT     /queries/{id}          P0        Update saved query
DELETE  /queries/{id}          P0        Delete saved query
POST    /query/funnel          P1        Windowed funnel via ClickHouse windowFunnel()
POST    /query/retention       P1        Cohort retention via ClickHouse retention()
POST    /query/session         P1        Session bucketing with inactivity timeout

Cache: query results are cached for 60s by default and profile lookups for 30s. Per-query TTLs are configurable via ANALYTICS_CACHE_TTL_*_SECONDS. Custom SQL is never cached.

Console pages (shipped)

  • Explore — wired to /query/events
  • Custom SQL — wired to /query/sql
  • Profiles / Funnels / Retention / Segments / Traits — placeholders

Testing

make test                  # unit tests (no containers)
make test/integration      # repo-layer integration tests (testcontainers)

Caveats

  • The profiles table is a read-only contract owned by cdp-ingestion, but it does not exist yet in the ingestion migrations. repo/profile_repo.go assumes profiles(id, workspace_id, user_id, anonymous_ids, traits, first_seen_at, last_seen_at); align the two before shipping.
  • /query/sql should run against an analytics_ro ClickHouse user with SELECT-only grants. If that account does not exist, the server falls back to the main connection and logs a warning. Fix this before production.
  • Auth: every request must supply X-Workspace-Id. Wire the console's workspace store to a real session/JWT once the auth scheme is decided.
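
The /query/sql fallback described in the caveats can be sketched as follows. This is an illustration rather than the server's code: pingFunc stands in for whatever connectivity check the ClickHouse driver provides, and chooseSandboxConn and the returned labels are hypothetical names.

```go
package main

import (
	"errors"
	"log"
)

// pingFunc is a hypothetical stand-in for "check that this ClickHouse
// connection is usable"; a real driver would supply its own ping.
type pingFunc func() error

var errNotConfigured = errors.New("analytics_ro connection not configured")

// chooseSandboxConn reports which connection /query/sql should use.
// It returns "analytics_ro" when the read-only connection answers the ping,
// and otherwise logs a warning and returns "main".
func chooseSandboxConn(pingReadOnly pingFunc) string {
	err := errNotConfigured
	if pingReadOnly != nil {
		err = pingReadOnly()
	}
	if err == nil {
		return "analytics_ro"
	}
	log.Printf("WARNING: /query/sql falling back to the main ClickHouse connection: %v", err)
	return "main"
}
```

For production, one option is to replace the fallback branch with a startup error, so the sandbox refuses to serve arbitrary SQL over a connection that may have write grants.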

See CLAUDE_analytics.md for the full design contract.