# Docker Deployment This directory contains the Docker runtime for Islandflow VPS deployments. Docker remains the default rollout path before native cutover and the rollback path after cutover. The repo-root `deploy` helper can target either: - `--runtime docker` for this Docker Compose stack - `--runtime native` for the host-native Bun + systemd rollout described in `deployment/native/README.md` The public VPS edge remains Nginx Proxy Manager. Docker fallback can be reached either through the shared Docker network service names or the host ports published by this stack. It is separate from the repo-root `docker-compose.yml`, which remains the lightweight local infra stack for development. Do not run the repo-root `docker-compose.yml` on the VPS. On the live server that creates a second compose project with host-published NATS, ClickHouse, and Redis ports that are not part of the supported production topology. ## What this stack does - Builds and runs the full Islandflow stack with Docker Compose. - Publishes `web` and `api` to host ports, bound to loopback by default. - Runs ClickHouse, Redis, and NATS JetStream with persistent host data under `ISLANDFLOW_DATA_ROOT`. - Runs the core runtime services: `ingest-options`, `ingest-equities`, `compute`, `candles`, `api`, and `web`. - Keeps `replay` opt-in through a Compose profile, because the current replay service starts immediately when the container is enabled. ## Files - `deployment/docker/docker-compose.yml`: production-style stack for a single VPS - `deployment/docker/Dockerfile.service`: shared Bun runtime image for most services - `deployment/docker/Dockerfile.ingest-options`: Bun runtime plus Python dependencies for Databento and IBKR adapters - `deployment/docker/Dockerfile.web`: multi-stage build for the Next.js web app - `deployment/docker/workspace-root/`: deployment-specific workspace snapshot (`package.json`, `tsconfig.base.json`, `bun.lock`) used by Docker builds - `deployment/docker/clickhouse/listen.xml`: forces ClickHouse to listen on IPv4 for other containers on the Docker network - `deployment/docker/.env.example`: container-oriented environment template ## Prerequisites - A Linux VPS with Docker Engine and Docker Compose v2 installed - Enough RAM for ClickHouse plus the Bun services Optional: - A DNS record pointed at the VPS - Any reverse proxy or load balancer you prefer - Alpaca, Databento, or IBKR credentials if you are not using the synthetic adapters ## First deployment 1. Copy the env template: ```bash cd deployment/docker cp .env.example .env ``` 2. Edit `.env`. Important defaults: - `NATS_URL`, `CLICKHOUSE_URL`, and `REDIS_URL` should stay on the internal container hostnames unless you intentionally split infra out. - `ISLANDFLOW_DATA_ROOT=/var/lib/islandflow` matches the native infra data root used by the VPS cutover helpers. - `OPTIONS_INGEST_ADAPTER=synthetic` and `EQUITIES_INGEST_ADAPTER=synthetic` are the safest first-boot settings. - `WEB_BIND_IP=127.0.0.1` and `API_BIND_IP=127.0.0.1` keep the published ports local to the host by default. - `WEB_HOST_PORT=3000` and `API_HOST_PORT=4000` control the host-side published ports. - `NEXT_PUBLIC_API_URL=` (empty, the default in `.env.example`) fits same-origin mode where your edge layer proxies API paths from the app origin to the API host port. - `NEXT_PUBLIC_API_URL=https://api.example.com` fits a two-origin setup where the browser reaches the API on a separate public origin. 3. Build and start the stack: ```bash docker compose build api web compute candles ingest-options ingest-equities docker compose up -d ``` If you are updating an existing deployment that already has failing `api` restart loops, do a full recreate so the ClickHouse config mount and dependency changes are applied cleanly: ```bash docker compose down docker compose build api web compute candles ingest-options ingest-equities docker compose up -d --force-recreate ``` 4. Confirm the containers are healthy: ```bash docker compose ps docker compose logs -f api web compute candles ingest-options ingest-equities ``` 5. Open the app. With the default loopback binding: - UI: `http://127.0.0.1:3000/` - Health check: `http://127.0.0.1:4000/health` If you want direct remote access without a reverse proxy, change `WEB_BIND_IP` and `API_BIND_IP` to `0.0.0.0` and restrict exposure with your firewall. ## Access patterns ### Direct host-port mode Use this when you want Docker alone to serve the app: - set `WEB_BIND_IP=0.0.0.0` - set `API_BIND_IP=0.0.0.0` - optionally change `WEB_HOST_PORT` / `API_HOST_PORT` - point DNS or clients at the host directly ### Reverse proxy mode If you use Caddy, Nginx, Traefik, a cloud load balancer, or another edge layer, proxy to the published host ports from this stack. The repo does not require or manage any specific proxy anymore. Supported routing modes: 1. Two-origin mode - `app.` -> host `WEB_HOST_PORT` - `api.` -> host `API_HOST_PORT` - Build web with `NEXT_PUBLIC_API_URL=https://api.`. 2. Same-origin mode - Build web with `NEXT_PUBLIC_API_URL=` (empty). - Point `app.` at the web host port. - Proxy these API routes from the app origin to the API host port: - `/ws/*`, `/replay/*`, `/prints/*`, `/joins/*`, `/nbbo/*`, `/dark/*`, `/flow/*`, `/candles/*`, `/history/*` Enable websocket support on whichever host serves `/ws/*`. For the current live Nginx Proxy Manager setup behind `flow.deltaisland.io`, keep the API location regex durable in the proxy host advanced config or API, not by hand-editing generated files under `/data/nginx/proxy_host/`. The route matcher should include history: ```nginx ^/(ws|replay|prints|joins|nbbo|dark|flow|candles|history)/ ``` ## Replay service Replay is disabled by default in this stack. Start it only when you want it: ```bash docker compose --profile replay up -d replay ``` Stop it again: ```bash docker compose stop replay ``` ## Adapter notes ### Synthetic mode This is the easiest way to smoke-test the deployment: - `OPTIONS_INGEST_ADAPTER=synthetic` - `EQUITIES_INGEST_ADAPTER=synthetic` ### Alpaca mode Set the adapter values and credentials in `.env`: - `OPTIONS_INGEST_ADAPTER=alpaca` - `EQUITIES_INGEST_ADAPTER=alpaca` - `ALPACA_KEY_ID=...` - `ALPACA_SECRET_KEY=...` ### Databento mode The `ingest-options` image in this deployment includes Python plus the repo’s sidecar dependencies, so Databento can run without a custom image. Set the Databento env vars in `.env`, especially: - `OPTIONS_INGEST_ADAPTER=databento` - `DATABENTO_API_KEY=...` - `DATABENTO_START=...` ### IBKR mode If TWS or IB Gateway is running on the VPS host, the default `.env.example` already points `IBKR_HOST` at `host.docker.internal`, and the Compose stack adds the required host gateway mapping. If IBKR is running somewhere else, change: - `IBKR_HOST` - `IBKR_PORT` ## Updating the deployment This deployment installs dependencies from `deployment/docker/workspace-root/bun.lock` rather than the repo-root lockfile. When dependencies change in any workspace used by Docker builds, refresh and validate the deployment snapshot first: ```bash bun run sync:docker-workspace bun run check:docker-workspace ``` Then validate the VPS build path: ```bash cd deployment/docker docker compose build web ``` ### Faster Docker builds The app images are structured so dependency installation is isolated from source code changes: - Docker first copies `package.json`, `bun.lock`, `tsconfig.base.json`, and workspace `package.json` files. - `bun install --frozen-lockfile` runs in a cacheable layer with a BuildKit Bun cache mount. - Source from `apps`, `services`, and `packages` is copied only after dependencies are installed. - `ingest-options` also installs its Python sidecar dependencies from `services/ingest-options/py/requirements.txt` before source copy, using a BuildKit pip cache mount. That means normal TypeScript edits should reuse dependency layers. The first build after a fresh server checkout, Docker cache cleanup, dependency change, or Python requirement change can still be slow; later deploys should spend their time on changed source and the specific service images being rolled out. BuildKit cache mounts require a modern Docker Engine with Dockerfile frontend support. Docker Compose v2 on the VPS path enables this by default. ## Safe rollouts on `152.53.80.229` The current live VPS uses Nginx Proxy Manager as the outer edge. Before native cutover, NPM routes Islandflow traffic to Docker service names. During cutover, `deployment/native/switch-npm-edge.sh native` retargets only the Islandflow proxy hosts to the NPM bridge gateway IP so NPM can reach native host ports. If needed, override the detected target with `ISLANDFLOW_NATIVE_HOST=`. The deploy helper also warns if it detects a second compose project named `islandflow` on the server, because that usually means the repo-root local-infra stack was started on the VPS by mistake. The checked-in deploy helper normally runs from your local repo checkout and targets: - SSH host: `delta@152.53.80.229` - SSH key: `~/.ssh/delta_ed25519` by default - Live repo checkout: `/home/delta/islandflow` - Live compose directory: `/home/delta/islandflow/deployment/docker` If you run `./deploy` from `/home/delta/islandflow` on the VPS itself, it now executes the remote steps locally instead of SSHing back into the same machine. You can still force SSH with `DEPLOY_FORCE_SSH=1`, or override the key path with `DEPLOY_SSH_KEY_PATH=/path/to/key`. It preserves the current Docker Compose project and avoids destructive cleanup on the server. ### Deploy `origin/main` ```bash ./deploy main ./deploy main --runtime docker ``` This flow: - fetches `origin` locally and shows the local branch state - checks the server checkout before switching anything - stops if the server has tracked local modifications - allows the known untracked tarball at `deployment/docker/signal-cli-0.14.3-Linux-native.tar.gz` - runs `git switch main`, `git pull --ff-only origin main`, `docker compose build api web compute candles ingest-options ingest-equities`, and `docker compose up -d` - verifies the stack with `docker compose ps`, recent service logs, container-local health checks, and public HTTPS checks ### Deploy the current local branch ```bash ./deploy current-branch ./deploy current-branch --runtime docker ``` Alias: ```bash ./deploy current branch ``` This flow: - requires a clean local working tree so you only deploy committed state - pushes the current local branch to `origin` - uses `git push -u origin ` automatically when the branch has no upstream yet - switches the server checkout to that same branch and keeps it there until you intentionally move it back - runs the same rebuild and verification steps as `main` ### Partial Docker rollouts Examples: ```bash ./deploy main --runtime docker --web-only ./deploy main --runtime docker --api-only ./deploy current-branch --runtime docker --services-only ./deploy main --runtime docker --workers-only ./deploy main --runtime docker --fast ./deploy main --runtime docker --web-only --no-build ``` Scoped Docker deploys now build only the selected image set and then restart only those services: - `--web-only`: `docker compose build web`, then `docker compose up -d web` - `--api-only`: `docker compose build api`, then `docker compose up -d api` - `--services-only`: builds and restarts `api`, `compute`, `candles`, `ingest-options`, and `ingest-equities` - `--workers-only`: builds and restarts `compute`, `candles`, `ingest-options`, and `ingest-equities` without touching `web` or `api` - `--fast`: when no explicit scope flag is given, treats the deploy as `--services-only` and skips the public API route suite for quicker completion. It still runs remote service health checks. Use `--no-build` only when the image is already correct and you need Compose to recreate or restart containers, such as after changing server-side environment values that do not affect a Next.js build-time variable. Do not use `--no-build` for dependency changes, application source changes, or `NEXT_PUBLIC_*` changes. ### Escalation path Use force recreate only when a normal refresh does not update the services cleanly: ```bash ./deploy main --runtime docker --force-recreate ./deploy current-branch --runtime docker --force-recreate ``` ### Return the server to `main` If the live checkout is on a branch deploy and you want normal production tracking again: ```bash ./deploy main ./deploy main --runtime docker ``` The helper always does the final public verification against: - `https://flow.deltaisland.io` It also verifies API health from inside the `api` container during the remote verification step. If you intentionally run a separate public API origin, add an extra public API check by exporting `DEPLOY_PUBLIC_API_HEALTH_URL` before running the deploy: ```bash DEPLOY_PUBLIC_API_HEALTH_URL=https://api.example.com/health ./deploy main ``` Same-origin deployments should leave that unset unless the edge layer exposes a public API health route on purpose. ## Manual server fallback If you need to run the rollout steps manually over SSH, use the same live checkout and compose directory. Avoid `git clean -fd`, `git reset --hard`, or other destructive cleanup during normal app rollouts. Deploy `main` manually: ```bash ssh -i ~/.ssh/delta_ed25519 -o IdentitiesOnly=yes delta@152.53.80.229 cd /home/delta/islandflow git fetch origin git switch main git pull --ff-only origin main cd /home/delta/islandflow/deployment/docker docker compose build api web compute candles ingest-options ingest-equities docker compose up -d ``` Deploy the current branch manually: ```bash git push -u origin # omit -u if upstream already exists ssh -i ~/.ssh/delta_ed25519 -o IdentitiesOnly=yes delta@152.53.80.229 cd /home/delta/islandflow git fetch origin git switch || git switch -c --track origin/ git pull --ff-only origin cd /home/delta/islandflow/deployment/docker docker compose build api web compute candles ingest-options ingest-equities docker compose up -d ``` If you changed only env values for the Bun services on the server: ```bash cd /home/delta/islandflow/deployment/docker docker compose up -d ``` ## JetStream retention rollout JetStream in this stack is the live event buffer between ingest, compute, candles, replay, and API services. ClickHouse remains the durable history layer; JetStream should stay bounded enough to protect the VPS during normal live operation. Why redeploy alone is not enough for old streams: - Older streams keep the retention settings they were created with. - A code deploy only helps new streams unless something explicitly reconciles existing stream configs. - This repo now includes both startup reconciliation and a manual audit/apply command so live streams can be corrected in place without deleting them. Target retention baseline: - Raw streams: `60m`, `512 MiB` - Derived streams: `12h`, `256 MiB` Audit current stream caps from a running service container: ```bash cd deployment/docker docker compose exec api bun packages/bus/src/reconcile-streams.ts --check ``` Apply in-place reconciliation: ```bash cd deployment/docker docker compose exec api bun packages/bus/src/reconcile-streams.ts --apply ``` Verify the rollout: 1. Re-run `--check` and require all lines to report `✓`. 2. Inspect service logs for any `structural-mismatch` line or reconciliation failure. 3. Confirm the production `.env` keeps these values: - `STREAM_RAW_MAX_AGE_MS=3600000` - `STREAM_RAW_MAX_BYTES=536870912` - `STREAM_DERIVED_MAX_AGE_MS=43200000` - `STREAM_DERIVED_MAX_BYTES=268435456` 4. Compare post-rollout `docker stats --no-stream` with the pre-rollout baseline and watch JetStream storage stabilize under the tighter caps. If any stream reports a structural mismatch, stop the rollout. Do not purge or recreate streams under this procedure; capture the stream name and mismatch details for follow-up. If you changed `NEXT_PUBLIC_API_URL` or `NEXT_PUBLIC_NBBO_MAX_AGE_MS`, rebuild the web image because those are public Next.js build-time values: ```bash cd /home/delta/islandflow/deployment/docker docker compose build web docker compose up -d web ``` ## Backups and persistence Persistent data lives in Docker volumes: - `clickhouse-data` - `redis-data` - `nats-data` Before destructive maintenance, back up those volumes or the underlying Docker data directory for the host. ## Shutdown Stop everything while keeping data: ```bash docker compose down ``` Stop everything and remove volumes too: ```bash docker compose down -v ``` Only use `-v` if you intentionally want to wipe ClickHouse, Redis, and JetStream state. ## Known caveats - The root `.env.example` still contains a `REPLAY_ENABLED` comment, but the current replay service does not read that variable. Use the Compose replay profile instead. - `web` and `api` bind to loopback by default. External access requires changing the bind IPs or placing a reverse proxy in front of the published host ports. - Some hosts disable IPv6 inside containers; the bundled ClickHouse config pins `listen_host` to `0.0.0.0` so the API can reach ClickHouse reliably over Docker networking. - The stack assumes a single-node VPS deployment. If you later split infra or add external managed services, update the three core connection URLs in `.env`. ## Smoke checks After the stack is up: - `docker compose ps` should show healthy `api`, `web`, `clickhouse`, and `redis` services. - `curl http://127.0.0.1:4000/health` should return a healthy response on the server. - `curl -I http://127.0.0.1:3000/` should return a successful HTTP status on the server. - In two-origin mode, browser requests should target `https://api./...` and live feeds should use `wss://api./ws/...`. - In same-origin mode, browser requests should target `https://app./...` for API paths and live feeds should use `wss://app./ws/...`. - In same-origin mode, `bun run check:public-api-routes` should pass for `/prints/options`, `/history/options`, `/replay/options`, `/nbbo/options`, and `/ws/live`.