# CDP Analytics (data-layer)

Read-side of the self-hosted CDP platform. Queries events written by
`cdp-ingestion`, computes traits and segments, and activates segments
to external tools.

## Services

| Service   | Lang         | Port | Role |
|-----------|--------------|------|------|
| `api`     | Go           | 4000 | Query API, Profile API, Custom SQL sandbox |
| `workers` | Go (river)   | 4001 | Computed Traits, Segment refresh, Reverse ETL |
| `console` | React + Vite | 4002 | Analytics UI |

## Quick start

Shared infra (Postgres / Redis / ClickHouse) is brought up by the ingestion
repo. Start it there first:

```bash
cd ../ingestion && make up
```

Then in this directory:

```bash
make migrate/up       # apply analytics PostgreSQL migrations
make clickhouse/up    # apply analytics ClickHouse DDL (if any)

# First time only:
(cd api && go mod tidy)
(cd workers && go mod tidy)
(cd console && npm install)

make run/api          # start API on :4000
make run/workers      # start worker on :4001
make run/console      # start console on :4002
```

## Endpoints (shipped)

All endpoints below require an `X-Workspace-Id` header (UUID). Workspace
membership / auth is a TODO; the header is the only authority for now.

| Method | Path                    | Priority | Description |
|--------|-------------------------|----------|-------------|
| GET    | `/health`               | -        | Liveness |
| GET    | `/ready`                | -        | Readiness |
| POST   | `/query/events`         | P0       | Filter raw events on one of `events_track`, `events_identify`, `events_page`, `events_group` |
| POST   | `/query/sql`            | P0       | Arbitrary `SELECT` on ClickHouse (read-only user) |
| GET    | `/profiles/{id}`        | P0       | Unified profile lookup |
| GET    | `/profiles/{id}/events` | P0       | Merged event timeline for the profile's `user_id` |
| POST   | `/queries`              | P0       | Create saved query |
| GET    | `/queries`              | P0       | List saved queries |
| GET    | `/queries/{id}`         | P0       | Get saved query |
| PUT    | `/queries/{id}`         | P0       | Update saved query |
| DELETE | `/queries/{id}`         | P0       | Delete saved query |
| POST   | `/query/funnel`         | P1       | Windowed funnel via ClickHouse `windowFunnel()` |
| POST   | `/query/retention`      | P1       | Cohort retention via ClickHouse `retention()` |
| POST   | `/query/session`        | P1       | Session bucketing with inactivity timeout |

Cache: 60s default for query results, 30s for profile lookups. Per-query
TTLs configurable via `ANALYTICS_CACHE_TTL_*_SECONDS`. Custom SQL is never
cached.

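
The cache rules above can be sketched as a small TTL lookup. This is an illustration only: the `cacheTTL` helper, the kind names (`events`, `profile`, `sql`), and the way a kind is expanded into `ANALYTICS_CACHE_TTL_<KIND>_SECONDS` are assumptions, since the README only gives the `ANALYTICS_CACHE_TTL_*_SECONDS` pattern.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// Defaults stated in the README: 60s for query results, 30s for profile lookups.
const (
	defaultQueryTTL   = 60 * time.Second
	defaultProfileTTL = 30 * time.Second
)

// cacheTTL returns the cache TTL for a kind of request (hypothetical helper).
// ASSUMPTION: the override variable is ANALYTICS_CACHE_TTL_<KIND>_SECONDS;
// the real variable names may differ.
func cacheTTL(kind string, getenv func(string) string) time.Duration {
	if kind == "sql" {
		return 0 // custom SQL is never cached
	}
	def := defaultQueryTTL
	if kind == "profile" {
		def = defaultProfileTTL
	}
	name := "ANALYTICS_CACHE_TTL_" + strings.ToUpper(kind) + "_SECONDS"
	secs, err := strconv.Atoi(getenv(name))
	if err != nil || secs < 0 {
		return def // unset or malformed override: keep the default
	}
	return time.Duration(secs) * time.Second
}

func main() {
	for _, kind := range []string{"events", "profile", "sql"} {
		fmt.Printf("%-8s %v\n", kind, cacheTTL(kind, os.Getenv))
	}
}
```

Passing `getenv` as a parameter keeps the lookup testable without touching the process environment.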
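
For readers unfamiliar with "session bucketing with inactivity timeout" (the `/query/session` row above), here is a minimal in-memory sketch of the idea. It is not the endpoint's implementation, which this README does not describe; the `sessionize` function and its use of Unix-second timestamps are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sessionize groups event timestamps (Unix seconds) into sessions. A new
// session starts whenever the gap to the previous event exceeds timeout.
// Hypothetical helper, shown only to illustrate the bucketing rule.
func sessionize(timestamps []int64, timeout int64) [][]int64 {
	// Copy before sorting so the caller's slice is not reordered.
	sorted := append([]int64(nil), timestamps...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	var sessions [][]int64
	for i, ts := range sorted {
		if i == 0 || ts-sorted[i-1] > timeout {
			sessions = append(sessions, nil) // open a new session
		}
		last := len(sessions) - 1
		sessions[last] = append(sessions[last], ts)
	}
	return sessions
}

func main() {
	// Events at 0, 10, 50 and 55 minutes with a 30 minute timeout:
	// the 40 minute gap between the second and third event splits them.
	events := []int64{0, 600, 3000, 3300}
	for i, s := range sessionize(events, 1800) {
		fmt.Printf("session %d: %v\n", i+1, s)
	}
}
```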
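
The `X-Workspace-Id` requirement described in this section could be enforced with a small middleware. The sketch below is a guess at the shape, not the code in `api`: `requireWorkspace`, `validWorkspaceID`, and the choice of a 400 response are all hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// uuidRe matches the canonical 8-4-4-4-12 hexadecimal UUID text form.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// validWorkspaceID reports whether s looks like a UUID.
func validWorkspaceID(s string) bool { return uuidRe.MatchString(s) }

// requireWorkspace rejects requests whose X-Workspace-Id header is missing
// or malformed. ASSUMPTION: the status code and message are illustrative.
func requireWorkspace(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !validWorkspaceID(r.Header.Get("X-Workspace-Id")) {
			http.Error(w, "missing or invalid X-Workspace-Id", http.StatusBadRequest)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := requireWorkspace(ok)

	// Exercise the middleware in-process; no server needs to be running.
	for _, id := range []string{"", "not-a-uuid", "123e4567-e89b-12d3-a456-426614174000"} {
		req := httptest.NewRequest(http.MethodGet, "/queries", nil)
		req.Header.Set("X-Workspace-Id", id)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		fmt.Printf("%q -> %d\n", id, rec.Code)
	}
}
```

A format check like this only validates the header's shape. Until workspace membership is implemented, it does not prove the caller belongs to that workspace.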
## Console pages (shipped)

- **Explore**: wired to `/query/events`
- **Custom SQL**: wired to `/query/sql`
- Profiles / Funnels / Retention / Segments / Traits: placeholders

## Testing

```bash
make test               # unit tests (no containers)
make test/integration   # repo-layer integration tests (testcontainers)
```

## Caveats

- The `profiles` table is a **read-only contract from cdp-ingestion**; it does
  not exist yet in the ingestion migrations. `repo/profile_repo.go` assumes
  `profiles(id, workspace_id, user_id, anonymous_ids, traits, first_seen_at,
  last_seen_at)`; align the two before shipping.
- `/query/sql` ideally runs against an `analytics_ro` ClickHouse user with
  `SELECT`-only grants. If that account does not exist, the server falls back
  to the main connection and logs a warning; fix this before production.
- Auth: every request must supply `X-Workspace-Id`. Wire the console's
  workspace store to a real session/JWT once the auth scheme is decided.

See [CLAUDE_analytics.md](./CLAUDE_analytics.md) for the full design contract.
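
To make the `profiles` contract from the first caveat concrete, the assumed columns can be written as a Go struct. Only the column names come from this README; the Go types, JSON tags, and the `profileColumns` helper are assumptions and may not match `repo/profile_repo.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// profileColumns lists the columns the README says the repo layer assumes.
var profileColumns = []string{
	"id", "workspace_id", "user_id", "anonymous_ids",
	"traits", "first_seen_at", "last_seen_at",
}

// Profile mirrors those columns. ASSUMPTION: the field types are guesses
// (IDs as strings, traits as a free-form JSON object).
type Profile struct {
	ID           string         `json:"id"`
	WorkspaceID  string         `json:"workspace_id"`
	UserID       string         `json:"user_id"`
	AnonymousIDs []string       `json:"anonymous_ids"`
	Traits       map[string]any `json:"traits"`
	FirstSeenAt  time.Time      `json:"first_seen_at"`
	LastSeenAt   time.Time      `json:"last_seen_at"`
}

func main() {
	seen := time.Date(2024, 1, 1, 9, 0, 0, 0, time.UTC)
	p := Profile{
		ID:           "p-1", // placeholder values for illustration
		WorkspaceID:  "123e4567-e89b-12d3-a456-426614174000",
		UserID:       "u-42",
		AnonymousIDs: []string{"anon-a", "anon-b"},
		Traits:       map[string]any{"plan": "pro"},
		FirstSeenAt:  seen,
		LastSeenAt:   seen.Add(48 * time.Hour),
	}
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("columns:", profileColumns)
}
```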