merge main into nextjs upgrade

dirtydishes 2026-05-19 14:47:43 -04:00
commit 171cf52518
40 changed files with 2355 additions and 131 deletions


@ -0,0 +1,23 @@
.git
.github
.DS_Store
.bun
.tmp
node_modules
dist
coverage
logs
apps/web/.next
.env
.env.*
session-ses_*.md
token-usage-output.txt
signal-cli-*.tar.gz
*.tar
*.tar.gz
*.tgz
*.zip
__pycache__
.pytest_cache
!.env.example
!**/.env.example


@ -4,8 +4,10 @@ NATS_URL=nats://nats:4222
CLICKHOUSE_URL=http://clickhouse:8123
CLICKHOUSE_DATABASE=default
REDIS_URL=redis://redis:6379
ISLANDFLOW_DATA_ROOT=/var/lib/islandflow
API_PORT=4000
API_HOST=0.0.0.0
API_BIND_IP=127.0.0.1
API_HOST_PORT=4000
WEB_BIND_IP=127.0.0.1


@ -60,4 +60,4 @@ COPY --from=build /app/packages ./packages
EXPOSE 3000
CMD ["bun", "run", "--cwd", "apps/web", "start"]
CMD ["bun", "run", "--cwd", "apps/web", "start", "--", "-H", "0.0.0.0", "-p", "3000"]


@ -2,12 +2,12 @@
This directory contains the Docker runtime for Islandflow VPS deployments.
Docker remains the default and recommended server rollout path, but the repo-root `deploy` helper can now target either:
Docker remains the default rollout path before native cutover and the rollback path after cutover. The repo-root `deploy` helper can target either:
- `--runtime docker` for this Docker Compose stack
- `--runtime native` for an experimental host-native Bun + systemd rollout described in `deployment/native/README.md`
- `--runtime native` for the host-native Bun + systemd rollout described in `deployment/native/README.md`
The repo no longer ships or supports a separate `deployment/npm` stack. If you want a reverse proxy, point it at the host ports published by this stack.
The public VPS edge remains Nginx Proxy Manager. Docker fallback can be reached either through the shared Docker network service names or the host ports published by this stack.
It is separate from the repo-root `docker-compose.yml`, which remains the lightweight local infra stack for development.
@ -17,7 +17,7 @@ Do not run the repo-root `docker-compose.yml` on the VPS. On the live server tha
- Builds and runs the full Islandflow stack with Docker Compose.
- Publishes `web` and `api` to host ports, bound to loopback by default.
- Runs ClickHouse, Redis, and NATS JetStream with persistent Docker volumes.
- Runs ClickHouse, Redis, and NATS JetStream with persistent host data under `ISLANDFLOW_DATA_ROOT`.
- Runs the core runtime services: `ingest-options`, `ingest-equities`, `compute`, `candles`, `api`, and `web`.
- Keeps `replay` opt-in through a Compose profile, because the current replay service starts immediately when the container is enabled.
@ -56,6 +56,7 @@ cp .env.example .env
Important defaults:
- `NATS_URL`, `CLICKHOUSE_URL`, and `REDIS_URL` should stay on the internal container hostnames unless you intentionally split infra out.
- `ISLANDFLOW_DATA_ROOT=/var/lib/islandflow` matches the native infra data root used by the VPS cutover helpers.
- `OPTIONS_INGEST_ADAPTER=synthetic` and `EQUITIES_INGEST_ADAPTER=synthetic` are the safest first-boot settings.
- `WEB_BIND_IP=127.0.0.1` and `API_BIND_IP=127.0.0.1` keep the published ports local to the host by default.
- `WEB_HOST_PORT=3000` and `API_HOST_PORT=4000` control the host-side published ports.
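The bind defaults above rely on `${VAR:-default}` expansion, which Compose shares with Bash. A minimal sketch of how the published `web` address resolves, assuming the Compose mapping `${WEB_BIND_IP:-127.0.0.1}:${WEB_HOST_PORT:-3000}:3000`; the `published_address` function is a hypothetical helper for illustration and is not part of the repo:

```bash
# Hypothetical helper: mirrors the host side of the Compose port mapping
# "${WEB_BIND_IP:-127.0.0.1}:${WEB_HOST_PORT:-3000}:3000".
published_address() {
  local bind_ip="${WEB_BIND_IP:-127.0.0.1}"
  local host_port="${WEB_HOST_PORT:-3000}"
  printf '%s:%s\n' "$bind_ip" "$host_port"
}

# With nothing set, the published port stays on loopback.
published_address

# Publishing on all interfaces is an explicit opt-in.
WEB_BIND_IP=0.0.0.0 WEB_HOST_PORT=8080 published_address
```

The same pattern applies to `API_BIND_IP` and `API_HOST_PORT`.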
@ -213,17 +214,19 @@ BuildKit cache mounts require a modern Docker Engine with Dockerfile frontend su
## Safe rollouts on `152.53.80.229`
The current live VPS uses Nginx Proxy Manager on the shared Docker network and routes public traffic to the Docker `web` and `api` containers by container name. Because of that, this Docker path remains the operationally correct default for the live server today.
The current live VPS uses Nginx Proxy Manager as the outer edge. Before native cutover, NPM routes Islandflow traffic to Docker service names. During cutover, `deployment/native/switch-npm-edge.sh native` retargets only the Islandflow proxy hosts to the NPM bridge gateway IP so NPM can reach native host ports. If needed, override the detected target with `ISLANDFLOW_NATIVE_HOST=<host-ip>`.
The deploy helper also warns if it detects a second compose project named `islandflow` on the server, because that usually means the repo-root local-infra stack was started on the VPS by mistake.
The checked-in deploy helper is meant to run from your local repo checkout, not from the VPS shell. It always targets:
The checked-in deploy helper normally runs from your local repo checkout and targets:
- SSH host: `delta@152.53.80.229`
- SSH key: `~/.ssh/delta_ed25519`
- SSH key: `~/.ssh/delta_ed25519` by default
- Live repo checkout: `/home/delta/islandflow`
- Live compose directory: `/home/delta/islandflow/deployment/docker`
If you run `./deploy` from `/home/delta/islandflow` on the VPS itself, it now executes the remote steps locally instead of SSHing back into the same machine. You can still force SSH with `DEPLOY_FORCE_SSH=1`, or override the key path with `DEPLOY_SSH_KEY_PATH=/path/to/key`.
It preserves the current Docker Compose project and avoids destructive cleanup on the server.
### Deploy `origin/main`
@ -271,6 +274,7 @@ Examples:
./deploy main --runtime docker --web-only
./deploy main --runtime docker --api-only
./deploy current-branch --runtime docker --services-only
./deploy main --runtime docker --workers-only
./deploy main --runtime docker --fast
./deploy main --runtime docker --web-only --no-build
```
@ -280,6 +284,7 @@ Scoped Docker deploys now build only the selected image set and then restart onl
- `--web-only`: `docker compose build web`, then `docker compose up -d web`
- `--api-only`: `docker compose build api`, then `docker compose up -d api`
- `--services-only`: builds and restarts `api`, `compute`, `candles`, `ingest-options`, and `ingest-equities`
- `--workers-only`: builds and restarts `compute`, `candles`, `ingest-options`, and `ingest-equities` without touching `web` or `api`
- `--fast`: when no explicit scope flag is given, treats the deploy as `--services-only` and skips the public API route suite for quicker completion. It still runs remote service health checks.
Use `--no-build` only when the image is already correct and you need Compose to recreate or restart containers, such as after changing server-side environment values that do not affect a Next.js build-time variable. Do not use `--no-build` for dependency changes, application source changes, or `NEXT_PUBLIC_*` changes.


@ -42,6 +42,8 @@ services:
init: true
expose:
- "3000"
ports:
- "${WEB_BIND_IP:-127.0.0.1}:${WEB_HOST_PORT:-3000}:3000"
networks:
- default
- shared
@ -64,8 +66,13 @@ services:
api:
<<: *service-common
command: ["services/api/src/index.ts"]
environment:
LOG_LEVEL: ${LOG_LEVEL:-warn}
API_HOST: 0.0.0.0
expose:
- "4000"
ports:
- "${API_BIND_IP:-127.0.0.1}:${API_HOST_PORT:-4000}:4000"
networks:
- default
- shared
@ -132,7 +139,7 @@ services:
soft: 262144
hard: 262144
volumes:
- clickhouse-data:/var/lib/clickhouse
- ${ISLANDFLOW_DATA_ROOT:-/var/lib/islandflow}/clickhouse:/var/lib/clickhouse
- ./clickhouse/listen.xml:/etc/clickhouse-server/config.d/listen.xml:ro
healthcheck:
test:
@ -150,7 +157,7 @@ services:
restart: unless-stopped
command: ["redis-server", "--appendonly", "yes"]
volumes:
- redis-data:/data
- ${ISLANDFLOW_DATA_ROOT:-/var/lib/islandflow}/redis:/data
healthcheck:
test:
[
@ -168,14 +175,9 @@ services:
restart: unless-stopped
command: ["-js", "-sd", "/data"]
volumes:
- nats-data:/data
- ${ISLANDFLOW_DATA_ROOT:-/var/lib/islandflow}/nats:/data
networks:
shared:
external: true
name: ${NPM_SHARED_NETWORK:-npm-shared}
volumes:
clickhouse-data:
redis-data:
nats-data:


@ -1,29 +1,167 @@
# Native Deployment
This directory documents the experimental host-native Islandflow rollout path used by:
This directory documents the host-native Islandflow rollout path used by:
```bash
./deploy main --runtime native
./deploy current-branch --runtime native
```
This runtime is intended for faster server iteration during the transition away from Docker-only app rollouts. It is not the recommended path for the current production VPS, which still uses Nginx Proxy Manager to reach the Docker `web` and `api` containers by container name on the shared Docker network. Local development should still prefer:
## Current operating model
- Docker for infra (`bun run dev:infra`)
- native Bun services (`bun run dev:services`)
- native Next.js web (`bun run dev:web`)
The native runtime is now intended for a phased VPS cutover. Docker remains the supported rollback runtime, but Docker and native app services must not own the same Islandflow scope at the same time, because the workers and API use durable JetStream consumers.
Today, the recommended split is:
- **Nginx Proxy Manager** remains the public `:80/:443` edge
- **Native system services** own NATS, Redis, and ClickHouse after infra cutover
- **Native user services** own `web`, `api`, and workers after app cutover
- **Docker Compose** remains available as the rollback runtime
- local development stays:
- Docker infra: `bun run dev:infra`
- native backend services: `bun run dev:services`
- native web: `bun run dev:web`
## What native deploy means here
The checked-in `deploy` helper assumes:
- the live repo checkout is still `/home/delta/islandflow`
- the live repo checkout is `/home/delta/islandflow`
- Bun is installed on the VPS
- app processes are managed by `systemd`
- infrastructure services such as NATS, ClickHouse, and Redis are already reachable from the host
- app processes are managed by `systemd --user`
- infrastructure services such as NATS, ClickHouse, and Redis are reachable from the host
- the web app runs from `apps/web` and is served with `next start -p 3000`
The deploy script updates the repo checkout, optionally runs `bun install --frozen-lockfile`, optionally rebuilds the web app, restarts the target systemd units, and then verifies the services locally on the VPS plus through the public app URL.
The deploy script updates the repo checkout, optionally runs `bun install --frozen-lockfile`, optionally rebuilds the web app, restarts the target user units, verifies local health, and then runs public verification when the selected scope includes the public edge.
## Live audit status on 2026-05-18
The plan assumptions were audited on the VPS:
- `bun` is installed and available at `/home/delta/.bun/bin/bun`
- `systemctl --user` is available and the `delta` user has lingering enabled
- `/home/delta/islandflow/.env` exists
- public `https://flow.deltaisland.io/replay/options` routing is healthy again
- the previously reported duplicate `islandflow` compose project is not currently present in `docker compose ls`
- native Islandflow user units were not installed at the start of the audit; this change now provides and installs the checked-in user unit files, but they remain disabled until an operator enables a scope intentionally
That means native worker deploy support is now provisioned on the host, but the native runtime should still be enabled scope by scope rather than started wholesale.
## Checked-in native ops assets
### Infra system units
Checked-in system service units and config live under:
- `deployment/native/systemd/system/islandflow-nats.service`
- `deployment/native/systemd/system/islandflow-redis.service`
- `deployment/native/systemd/system/islandflow-clickhouse.service`
- `deployment/native/config/redis.conf`
- `deployment/native/config/clickhouse-listen.xml`
Install and start them on the VPS with:
```bash
./deployment/native/bootstrap-infra.sh
```
Or install and start manually:
```bash
sudo ./deployment/native/install-infra-units.sh
sudo ./deployment/native/start-infra.sh
./deployment/native/check-native-infra.sh
```
The native infra services bind to loopback and use stable host data paths:
- NATS JetStream: `/var/lib/islandflow/nats`
- Redis: `/var/lib/islandflow/redis`
- ClickHouse: `/var/lib/islandflow/clickhouse`
The Docker fallback compose file uses the same `ISLANDFLOW_DATA_ROOT` default of `/var/lib/islandflow`, so rollback can preserve durable state when only one runtime is active.
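Because both runtimes point at the same data root, it can help to confirm the three data directories exist before switching ownership. The following is a pre-flight sketch only; `data_dirs` and `report_data_dirs` are hypothetical helpers, not checked-in scripts, and the subdirectory names are taken from the paths listed above:

```bash
# Hypothetical helper: print the expected data directories, one per line.
data_dirs() {
  local root="${ISLANDFLOW_DATA_ROOT:-/var/lib/islandflow}"
  local name
  for name in nats redis clickhouse; do
    printf '%s/%s\n' "$root" "$name"
  done
}

# Report whether each expected directory is present on this host.
report_data_dirs() {
  local dir
  while IFS= read -r dir; do
    if [[ -d "$dir" ]]; then
      echo "present $dir"
    else
      echo "missing $dir"
    fi
  done < <(data_dirs)
}

report_data_dirs
```

This only checks for presence; it does not verify ownership or that exactly one runtime has the data open.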
### User unit templates
Checked-in unit files live under:
- `deployment/native/systemd/user/islandflow-web.service`
- `deployment/native/systemd/user/islandflow-api.service`
- `deployment/native/systemd/user/islandflow-compute.service`
- `deployment/native/systemd/user/islandflow-candles.service`
- `deployment/native/systemd/user/islandflow-ingest-options.service`
- `deployment/native/systemd/user/islandflow-ingest-equities.service`
These are written for the current VPS layout:
- repo root: `/home/delta/islandflow`
- Bun binary: `/home/delta/.bun/bin/bun`
- env file: `/home/delta/islandflow/.env`
### Install the units
```bash
./deployment/native/install-user-units.sh
./deployment/native/install-user-units.sh workers
systemctl --user start islandflow-compute.service
```
Install script behavior:
- copies the checked-in unit files into `~/.config/systemd/user`
- reloads the user systemd daemon
- enables only the scope you explicitly request
- defaults to installing without enabling anything yet
### Smoke test helper
```bash
./deployment/native/check-native-health.sh workers
./deployment/native/check-native-health.sh services
./deployment/native/check-native-health.sh full
```
This validates:
- native infra health for `full`, `api`, `services`, and `workers`
- `systemctl --user is-active` for the selected units
- local API health at `http://127.0.0.1:4000/health` when API scope is included
- local web health at `http://127.0.0.1:3000/` when web scope is included
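The checked-in helper probes each endpoint once. If a unit has only just been restarted, a short retry loop around the same probes may be more forgiving. A minimal sketch, assuming a hypothetical `retry` helper that is not part of the repo:

```bash
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example usage against the local health endpoints listed above:
#   retry 5 2 curl -fsS http://127.0.0.1:4000/health
#   retry 5 2 curl -I -fsS http://127.0.0.1:3000/
```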
### App cutover and edge switch helpers
```bash
./deployment/native/cutover.sh full
./deployment/native/switch-npm-edge.sh native
./deployment/native/full-rollback.sh
```
The edge switch helper updates the Nginx Proxy Manager database entries for `flow.deltaisland.io` and `api.flow.deltaisland.io`, preserving the same-origin Islandflow API location matcher:
```nginx
^/(ws|replay|prints|joins|nbbo|dark|flow|candles|history)/
```
For native cutover, the helper targets the NPM bridge gateway IP by default, not `host.docker.internal`. NPM generates `proxy_pass` with a runtime-resolved `$server` variable, so Docker's `/etc/hosts` alias is not sufficient for these proxy hosts. On the current VPS that native target resolves to `172.18.0.1`, which reaches the host-native `3000` and `4000` listeners from the NPM container.
Switching back to Docker restores upstreams to the Compose service names `web:3000` and `api:4000`.
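The default native target resolution described above can be sketched as follows. It condenses the approach used by `switch-npm-edge.sh`; the function names `first_gateway` and `resolve_native_target` are illustrative and are not the script's own:

```bash
# Pick the first non-empty line from a list of gateway IPs on stdin.
first_gateway() {
  sed '/^$/d' | head -n1
}

# Prefer an explicit override, otherwise ask Docker for the gateway of the
# networks attached to the NPM container.
resolve_native_target() {
  local container="${NPM_CONTAINER_NAME:-nginx-proxy-manager}"
  if [[ -n "${ISLANDFLOW_NATIVE_HOST:-}" ]]; then
    printf '%s\n' "$ISLANDFLOW_NATIVE_HOST"
    return 0
  fi
  docker inspect "$container" \
    --format '{{range .NetworkSettings.Networks}}{{println .Gateway}}{{end}}' \
    | first_gateway
}
```

When the container is attached to several networks, the first gateway listed wins, which is why `ISLANDFLOW_NATIVE_HOST` exists as an override.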
### Rollback helper
```bash
./deployment/native/rollback.sh <git-ref> workers
./deployment/native/rollback.sh <git-ref> services
```
Rollback helper behavior:
- requires a clean repo state
- fetches refs
- switches the checkout to a detached target ref
- reruns `bun install --frozen-lockfile`
- rebuilds the web app only when web scope is included
- restarts the selected user units
- runs the native smoke checks
## Expected unit names
@ -54,87 +192,104 @@ Available overrides:
## systemctl invocation
By default the deploy helper uses:
```bash
sudo -n systemctl
```
If the server uses user units or another wrapper, override it locally before invoking `./deploy`:
For the checked-in user units, use:
```bash
export DEPLOY_NATIVE_SYSTEMCTL_PREFIX="systemctl --user"
./deploy main --runtime native
```
The deploy helper defaults to `sudo -n systemctl`, but that is only appropriate if you intentionally install matching system units.
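How the `deploy` helper parses `DEPLOY_NATIVE_SYSTEMCTL_PREFIX` is not shown here. One plausible way to turn such a prefix into a safe argument vector is sketched below, assuming the value is a plain space-separated command with no quoting; `build_systemctl_cmd` and `systemctl_cmd` are hypothetical names:

```bash
# Hypothetical helper: split the prefix into an array so each word is passed
# as its own argument. Assumes no quoted or escaped words in the value.
build_systemctl_cmd() {
  read -r -a systemctl_cmd <<< "${DEPLOY_NATIVE_SYSTEMCTL_PREFIX:-sudo -n systemctl}"
}

build_systemctl_cmd
echo "would run: ${systemctl_cmd[*]} restart islandflow-compute.service"
```

A real invocation would then look like `"${systemctl_cmd[@]}" restart islandflow-compute.service`.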
## Partial native rollouts
Examples:
```bash
./deploy main --runtime native --web-only
./deploy main --runtime native --api-only
./deploy current-branch --runtime native --services-only
./deploy main --runtime native --workers-only
./deploy main --runtime native --fast
./deploy main --runtime native --web-only --no-build
./deploy main --runtime native --services-only
./deploy main --runtime native --web-only
./deploy current-branch --runtime native --workers-only --no-build
```
Scope behavior:
- default: restart web + API + backend services
- default: restart web + API + worker services
- `--web-only`: rebuild/restart only the web unit
- `--api-only`: restart only the API unit
- `--services-only`: restart API + backend units without touching the web unit
- `--fast`: when no explicit scope flag is provided, uses the same `--services-only` scope and trims verbose verification output for quicker completion
- `--services-only`: restart API + worker units without touching the web unit
- `--workers-only`: restart only `compute`, `candles`, `ingest-options`, and `ingest-equities`
- `--fast`: when no explicit scope flag is provided, native deploys now default to `--workers-only`
- `--no-build`: skip `bun install --frozen-lockfile` and skip the web build step
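The scope-to-unit mapping behind these flags can be sketched as a single function. The unit names come from the checked-in user units; the `units_for_scope` helper itself is hypothetical and only illustrates the mapping:

```bash
# Hypothetical helper: print the user units that belong to a deploy scope.
units_for_scope() {
  local workers=(
    islandflow-compute.service
    islandflow-candles.service
    islandflow-ingest-options.service
    islandflow-ingest-equities.service
  )
  case "$1" in
    full)     echo islandflow-web.service islandflow-api.service "${workers[@]}" ;;
    web)      echo islandflow-web.service ;;
    api)      echo islandflow-api.service ;;
    services) echo islandflow-api.service "${workers[@]}" ;;
    workers)  echo "${workers[@]}" ;;
    *)        echo "Unknown scope: $1" >&2; return 1 ;;
  esac
}

units_for_scope workers
```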
## Current status
## Edge-cutover guardrail
On the current live VPS, native deploys should be treated as opt-in infrastructure work, not the default rollout path. Before a native deploy can succeed there, all of the following must be true at the same time:
- Bun is installed on the host.
- The selected `systemctl` command works non-interactively.
- Islandflow systemd units exist for the requested scope.
- Host-native services can reach the intended NATS, ClickHouse, and Redis endpoints.
- If `web` or `api` move native, the reverse proxy topology is updated deliberately.
Until that is prepared intentionally, prefer:
Native deploys that touch the public web or API edge are intentionally blocked unless you acknowledge cutover readiness:
```bash
./deploy main --runtime docker
./deploy current-branch --runtime docker
export DEPLOY_NATIVE_EDGE_READY=1
```
## Server preparation checklist
Without that variable, these commands are refused:
Before the first native rollout, ensure the VPS has:
- `./deploy main --runtime native`
- `./deploy main --runtime native --web-only`
- `./deploy main --runtime native --api-only`
- `./deploy main --runtime native --services-only`
1. Bun installed and on `PATH`
2. a working `/home/delta/islandflow/.env` (or unit-managed equivalent env source)
3. systemd units for each target service
4. the web unit configured to serve the built app on port `3000`
5. the API unit configured to serve health checks on port `4000`
6. infrastructure endpoints configured so the native services can reach NATS, ClickHouse, and Redis
This keeps native app ownership explicit until infra, app health, and proxy routing are switched deliberately.
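The guardrail amounts to a check like the one below. This is a sketch of the described behavior, not the `deploy` helper's actual code; `edge_guard` is a hypothetical name, and the scope values are assumed to match the ones used by the native helper scripts:

```bash
# Hypothetical guard: only the workers scope avoids the public edge, so every
# other scope requires an explicit DEPLOY_NATIVE_EDGE_READY=1 acknowledgement.
edge_guard() {
  local scope="$1"
  case "$scope" in
    workers) return 0 ;;
  esac
  if [[ "${DEPLOY_NATIVE_EDGE_READY:-0}" != "1" ]]; then
    echo "Refusing native deploy for scope '$scope': set DEPLOY_NATIVE_EDGE_READY=1 after edge cutover." >&2
    return 1
  fi
}

edge_guard workers && echo "workers scope allowed"
```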
## Verification
## Running deploy from the VPS itself
Native deploys verify:
If you run `./deploy` from `/home/delta/islandflow` on the live server, the deploy helper now executes the remote steps locally instead of SSHing back into the same machine.
- target units are active via `systemctl`
- recent unit status and journal output can be collected
- local `http://127.0.0.1:4000/health` when API scope is included
- local `http://127.0.0.1:3000/` when web scope is included
- the public app URL from the local machine after the rollout finishes
That means:
## Rollback
- no SSH key is required for on-server deploy execution
- timing and verification behavior stay the same
- you can still force SSH with `DEPLOY_FORCE_SSH=1`
- you can override the SSH key path with `DEPLOY_SSH_KEY_PATH=/path/to/key`
Rollback remains manual for now:
## Validation matrix
1. switch the server checkout back to the last known-good branch or commit
2. rerun the appropriate native deploy command
3. if needed, restart only the affected units with `systemctl`
| Area | Native workers-only | Native edge cutover |
| --- | --- | --- |
| Bun installed | required | required |
| `systemctl --user` works | required | required |
| Islandflow user units installed | worker units only | all units |
| Host access to NATS/ClickHouse/Redis | required | required |
| Proxy routes updated for `/ws`, `/replay`, `/prints`, `/joins`, `/nbbo`, `/dark`, `/flow`, `/candles`, `/history` | not required | required |
| Public app check | not required | required |
| Public API route suite | not required | required |
Docker remains the fallback and currently recommended runtime during the transition:
## Staged cutover plan
1. **Stage 1: native workers only**
- install user units
- validate `./deployment/native/check-native-health.sh workers`
- use `./deploy main --runtime native --fast`
2. **Stage 2: native API behind local-only verification**
- start `islandflow-api.service`
- confirm `curl http://127.0.0.1:4000/health`
- do not switch public routing yet
3. **Stage 3: deliberate public edge cutover**
- update proxy routing to native `web`/`api`
- export `DEPLOY_NATIVE_EDGE_READY=1`
- run full native deploy
- validate `bun run scripts/check-public-api-routes.ts https://flow.deltaisland.io`
4. **Stage 4: decide final default runtime**
- keep Docker as fallback until native edge has proven stable
## Recommended current commands
Fast backend iteration before edge cutover:
```bash
export DEPLOY_NATIVE_SYSTEMCTL_PREFIX="systemctl --user"
./deploy main --runtime native --fast
```
Supported production path today:
```bash
./deploy main --runtime docker


@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
if [[ "${EUID}" -eq 0 ]]; then
"$repo_root/deployment/native/install-infra-units.sh"
else
sudo "$repo_root/deployment/native/install-infra-units.sh"
fi
echo "Stopping Docker Islandflow services before native infra opens durable data."
(
cd "$repo_root/deployment/docker"
docker compose stop web api compute candles ingest-options ingest-equities nats redis clickhouse
)
if [[ "${EUID}" -eq 0 ]]; then
"$repo_root/deployment/native/start-infra.sh"
else
sudo "$repo_root/deployment/native/start-infra.sh"
fi
"$repo_root/deployment/native/check-native-infra.sh"


@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail
scope="${1:-full}"
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
units=()
case "$scope" in
full)
units=(islandflow-web.service islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
web)
units=(islandflow-web.service)
;;
api)
units=(islandflow-api.service)
;;
services)
units=(islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
workers)
units=(islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
*)
echo "Unknown scope: $scope" >&2
echo "Expected one of: full, web, api, services, workers" >&2
exit 1
;;
esac
case "$scope" in
full|api|services|workers)
"$repo_root/deployment/native/check-native-infra.sh"
;;
esac
for unit in "${units[@]}"; do
systemctl --user is-active --quiet "$unit"
echo "ok $unit"
done
if [[ " ${units[*]} " == *" islandflow-api.service "* ]]; then
curl -fksS http://127.0.0.1:4000/health >/dev/null
echo "ok api-health"
fi
if [[ " ${units[*]} " == *" islandflow-web.service "* ]]; then
curl -I -fksS http://127.0.0.1:3000/ >/dev/null
echo "ok web-health"
fi


@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
systemctl is-active --quiet islandflow-nats.service
echo "ok islandflow-nats.service"
systemctl is-active --quiet islandflow-redis.service
echo "ok islandflow-redis.service"
systemctl is-active --quiet islandflow-clickhouse.service
echo "ok islandflow-clickhouse.service"
if command -v redis-cli >/dev/null 2>&1; then
redis-cli -h 127.0.0.1 -p 6379 ping | grep -q PONG
else
timeout 2 bash -c '</dev/tcp/127.0.0.1/6379'
fi
echo "ok redis-ping"
curl -fksS http://127.0.0.1:8123/ping | grep -q Ok
echo "ok clickhouse-ping"
timeout 2 bash -c '</dev/tcp/127.0.0.1/4222'
echo "ok nats-port"


@ -0,0 +1,6 @@
<clickhouse>
<listen_host>127.0.0.1</listen_host>
<path>/var/lib/islandflow/clickhouse/</path>
<tmp_path>/var/lib/islandflow/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/islandflow/clickhouse/user_files/</user_files_path>
</clickhouse>


@ -0,0 +1,10 @@
bind 127.0.0.1
protected-mode yes
port 6379
dir /var/lib/islandflow/redis
appendonly yes
save 900 1
save 300 10
save 60 10000
loglevel notice
databases 16

deployment/native/cutover.sh Executable file

@ -0,0 +1,34 @@
#!/usr/bin/env bash
set -euo pipefail
scope="${1:-full}"
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
case "$scope" in
full|services|workers|api|web)
;;
*)
echo "Usage: deployment/native/cutover.sh [full|services|workers|api|web]" >&2
exit 1
;;
esac
echo "Stopping Docker-owned Islandflow app services before native ownership starts."
(
cd "$repo_root/deployment/docker"
docker compose stop web api compute candles ingest-options ingest-equities
)
if [[ "$scope" == "full" || "$scope" == "services" || "$scope" == "api" || "$scope" == "workers" ]]; then
"$repo_root/deployment/native/check-native-infra.sh"
fi
case "$scope" in
full) units=(islandflow-web.service islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service) ;;
services) units=(islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service) ;;
workers) units=(islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service) ;;
api) units=(islandflow-api.service) ;;
web) units=(islandflow-web.service) ;;
esac
systemctl --user restart "${units[@]}"
"$repo_root/deployment/native/check-native-health.sh" "$scope"


@ -0,0 +1,27 @@
#!/usr/bin/env bash
set -euo pipefail
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
echo "Stopping native app services."
systemctl --user stop islandflow-web.service islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service || true
echo "Stopping native infra before Docker reopens durable data."
if [[ "${EUID}" -eq 0 ]]; then
systemctl stop islandflow-nats.service islandflow-redis.service islandflow-clickhouse.service || true
else
sudo systemctl stop islandflow-nats.service islandflow-redis.service islandflow-clickhouse.service || true
fi
echo "Switching NPM Islandflow upstreams back to Docker service names."
"$repo_root/deployment/native/switch-npm-edge.sh" docker
echo "Restarting Docker Islandflow runtime."
(
cd "$repo_root/deployment/docker"
docker compose up -d web api compute candles ingest-options ingest-equities
)
curl -I -fksS "${DEPLOY_PUBLIC_APP_URL:-https://flow.deltaisland.io}" >/dev/null
curl -fksS "${DEPLOY_PUBLIC_API_HEALTH_URL:-https://api.flow.deltaisland.io/health}" >/dev/null
echo "Rollback validation passed."


@ -0,0 +1,72 @@
#!/usr/bin/env bash
set -euo pipefail
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
system_unit_source_dir="$repo_root/deployment/native/systemd/system"
config_source_dir="$repo_root/deployment/native/config"
if [[ "${EUID}" -ne 0 ]]; then
echo "Run as root: sudo $0" >&2
exit 1
fi
resolve_binary() {
local name="$1"
local path=""
path="$(command -v "$name" 2>/dev/null || true)"
if [[ -n "$path" ]]; then
printf '%s\n' "$path"
return 0
fi
for candidate in "/usr/bin/$name" "/usr/sbin/$name" "/usr/local/bin/$name" "/usr/local/sbin/$name"; do
if [[ -x "$candidate" ]]; then
printf '%s\n' "$candidate"
return 0
fi
done
return 1
}
missing=()
for command in nats-server redis-server clickhouse-server; do
if ! resolve_binary "$command" >/dev/null; then
missing+=("$command")
fi
done
if [[ ${#missing[@]} -gt 0 ]]; then
echo "Missing native infra binaries: ${missing[*]}" >&2
echo "Install NATS Server, Redis Server, and ClickHouse Server before bootstrapping native infra." >&2
echo "On Debian, Redis is usually available as redis-server; ClickHouse and NATS may require their vendor repositories or packaged binaries." >&2
exit 1
fi
ensure_system_user() {
local name="$1"
local home="$2"
getent group "$name" >/dev/null || groupadd --system "$name"
getent passwd "$name" >/dev/null || useradd --system --gid "$name" --home-dir "$home" --shell /usr/sbin/nologin "$name"
}
ensure_system_user nats /var/lib/islandflow/nats
ensure_system_user redis /var/lib/islandflow/redis
ensure_system_user clickhouse /var/lib/islandflow/clickhouse
install -d -m 0755 /etc/islandflow
install -m 0644 "$config_source_dir/redis.conf" /etc/islandflow/redis.conf
install -d -m 0755 /etc/clickhouse-server/config.d
install -m 0644 "$config_source_dir/clickhouse-listen.xml" /etc/clickhouse-server/config.d/islandflow-listen.xml
install -d -o nats -g nats -m 0750 /var/lib/islandflow/nats
install -d -o redis -g redis -m 0750 /var/lib/islandflow/redis
install -d -o clickhouse -g clickhouse -m 0750 /var/lib/islandflow/clickhouse
install -m 0644 "$system_unit_source_dir"/islandflow-*.service /etc/systemd/system/
systemctl daemon-reload
echo "Installed native infra system units and config."
echo "Start infra with: sudo deployment/native/start-infra.sh"


@ -0,0 +1,49 @@
#!/usr/bin/env bash
set -euo pipefail
scope="${1:-none}"
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
unit_source_dir="$repo_root/deployment/native/systemd/user"
unit_target_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
units=()
case "$scope" in
none)
;;
full)
units=(islandflow-web.service islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
web)
units=(islandflow-web.service)
;;
api)
units=(islandflow-api.service)
;;
services)
units=(islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
workers)
units=(islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
*)
echo "Unknown scope: $scope" >&2
echo "Expected one of: none, full, web, api, services, workers" >&2
exit 1
;;
esac
mkdir -p "$unit_target_dir"
cp "$unit_source_dir"/*.service "$unit_target_dir"/
systemctl --user daemon-reload
if [[ ${#units[@]} -gt 0 ]]; then
systemctl --user enable "${units[@]}"
fi
echo "Installed Islandflow user units into $unit_target_dir"
if [[ ${#units[@]} -gt 0 ]]; then
echo "Enabled scope: $scope"
else
echo "No units enabled yet. Pass a scope such as workers when you are ready."
fi

deployment/native/rollback.sh Executable file

@ -0,0 +1,57 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ $# -lt 1 || $# -gt 2 ]]; then
echo "Usage: deployment/native/rollback.sh <git-ref> [full|web|api|services|workers]" >&2
exit 1
fi
ref="$1"
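scope="${2:-services}"
# Validate the scope before touching the checkout, so a typo cannot leave the repo detached.
case "$scope" in
full|web|api|services|workers) ;;
*) echo "Unknown scope: $scope" >&2; exit 1 ;;
esac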
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
cd "$repo_root"
if [[ -n "$(git status --porcelain=v1)" ]]; then
echo "Refusing rollback with a dirty working tree." >&2
exit 1
fi
current_ref="$(git rev-parse --short HEAD)"
echo "Rolling back from $current_ref to $ref (scope: $scope)"
git fetch --all --prune
git switch --detach "$ref"
bun install --frozen-lockfile
if [[ "$scope" == "full" || "$scope" == "web" ]]; then
bun --cwd=apps/web run build
fi
case "$scope" in
full)
units=(islandflow-web.service islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
web)
units=(islandflow-web.service)
;;
api)
units=(islandflow-api.service)
;;
services)
units=(islandflow-api.service islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
workers)
units=(islandflow-compute.service islandflow-candles.service islandflow-ingest-options.service islandflow-ingest-equities.service)
;;
*)
echo "Unknown scope: $scope" >&2
exit 1
;;
esac
systemctl --user restart "${units[@]}"
"$repo_root/deployment/native/check-native-health.sh" "$scope"
echo "Rollback complete. Repo is now detached at $(git rev-parse --short HEAD)."
echo "Return to tracked main later with: git switch main && git pull --ff-only <remote> main"

View file

@ -0,0 +1,17 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ "${EUID}" -ne 0 ]]; then
echo "Run as root: sudo $0" >&2
exit 1
fi
# Disable the distro-packaged units so they do not compete with the islandflow-* units.
for unit in redis-server.service nats-server.service clickhouse-server.service; do
if systemctl list-unit-files "$unit" >/dev/null 2>&1; then
systemctl disable --now "$unit" >/dev/null 2>&1 || true
fi
done
systemctl reset-failed islandflow-nats.service islandflow-redis.service islandflow-clickhouse.service || true
systemctl enable --now islandflow-nats.service islandflow-redis.service islandflow-clickhouse.service
"$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/check-native-infra.sh"

View file

@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ "${EUID}" -ne 0 ]]; then
echo "Run as root: sudo $0" >&2
exit 1
fi
systemctl stop islandflow-nats.service islandflow-redis.service islandflow-clickhouse.service

View file

@ -0,0 +1,285 @@
#!/usr/bin/env bash
set -euo pipefail
target="${1:-native}"
npm_root="${NPM_ROOT:-/home/delta/nginx-proxy-manager}"
db_path="${NPM_DB_PATH:-$npm_root/data/database.sqlite}"
app_domain="${ISLANDFLOW_APP_DOMAIN:-flow.deltaisland.io}"
api_domain="${ISLANDFLOW_API_DOMAIN:-api.flow.deltaisland.io}"
native_host="${ISLANDFLOW_NATIVE_HOST:-}"
docker_web_host="${ISLANDFLOW_DOCKER_WEB_HOST:-web}"
docker_api_host="${ISLANDFLOW_DOCKER_API_HOST:-api}"
web_port="${ISLANDFLOW_WEB_PORT:-3000}"
api_port="${ISLANDFLOW_API_PORT:-4000}"
restart_npm="${NPM_RESTART:-1}"
npm_container="${NPM_CONTAINER_NAME:-nginx-proxy-manager}"
sudo_cmd=()
case "$target" in
native|docker)
;;
*)
echo "Usage: deployment/native/switch-npm-edge.sh [native|docker]" >&2
exit 1
;;
esac
# python3 performs the database update below, so fail before touching anything if it is missing.
if ! command -v python3 >/dev/null 2>&1; then
echo "python3 is required to update the NPM database." >&2
exit 1
fi
resolve_native_host() {
if [[ -n "$native_host" ]]; then
printf '%s\n' "$native_host"
return
fi
if command -v docker >/dev/null 2>&1 && docker ps --format '{{.Names}}' | grep -qx "$npm_container"; then
native_host="$(docker inspect "$npm_container" --format '{{range .NetworkSettings.Networks}}{{println .Gateway}}{{end}}' | sed '/^$/d' | head -n1)"
if [[ -n "$native_host" ]]; then
printf '%s\n' "$native_host"
return
fi
fi
echo "Unable to determine the native upstream host for NPM." >&2
echo "Set ISLANDFLOW_NATIVE_HOST explicitly or start the $npm_container container first." >&2
exit 1
}
if [[ "$target" == "native" ]]; then
native_host="$(resolve_native_host)"
fi
# Check that the database exists before deciding whether sudo is needed to write it.
if [[ ! -f "$db_path" ]]; then
echo "NPM database not found: $db_path" >&2
exit 1
fi
if [[ ! -w "$db_path" || ! -w "$(dirname "$db_path")" ]]; then
if [[ "${EUID}" -eq 0 ]]; then
sudo_cmd=()
elif command -v sudo >/dev/null 2>&1; then
sudo_cmd=(sudo)
else
echo "NPM database path is not writable and sudo is unavailable: $db_path" >&2
exit 1
fi
fi
backup="$db_path.before-islandflow-$target-$(date +%Y%m%d%H%M%S)"
"${sudo_cmd[@]}" cp "$db_path" "$backup"
echo "Backed up NPM database to $backup"
"${sudo_cmd[@]}" python3 - "$db_path" "$target" "$app_domain" "$api_domain" "$native_host" "$docker_web_host" "$docker_api_host" "$web_port" "$api_port" <<'PY'
import json
import sqlite3
import sys
db_path, target, app_domain, api_domain, native_host, docker_web_host, docker_api_host, web_port, api_port = sys.argv[1:]
web_host = native_host if target == "native" else docker_web_host
api_host = native_host if target == "native" else docker_api_host
advanced_config = f"""location ~ ^/(ws|replay|prints|joins|nbbo|dark|flow|candles|history)/ {{
set $forward_scheme http;
set $server "{api_host}";
set $port {api_port};
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_http_version 1.1;
include conf.d/include/proxy.conf;
}}"""
def has_domain(raw, domain):
try:
return domain in json.loads(raw)
except Exception:
return domain in raw
con = sqlite3.connect(db_path)
cur = con.cursor()
rows = list(cur.execute("select id, domain_names from proxy_host where is_deleted = 0"))
app_ids = [row_id for row_id, domains in rows if has_domain(domains, app_domain)]
api_ids = [row_id for row_id, domains in rows if has_domain(domains, api_domain)]
if len(app_ids) != 1 or len(api_ids) != 1:
raise SystemExit(f"Expected one app and one API proxy host, found app={app_ids} api={api_ids}")
cur.execute(
"update proxy_host set forward_scheme = 'http', forward_host = ?, forward_port = ?, allow_websocket_upgrade = 1, advanced_config = ?, modified_on = datetime('now') where id = ?",
(web_host, int(web_port), advanced_config, app_ids[0]),
)
cur.execute(
"update proxy_host set forward_scheme = 'http', forward_host = ?, forward_port = ?, allow_websocket_upgrade = 1, modified_on = datetime('now') where id = ?",
(api_host, int(api_port), api_ids[0]),
)
con.commit()
print(f"Updated {app_domain} -> {web_host}:{web_port}")
print(f"Updated {api_domain} -> {api_host}:{api_port}")
PY
if command -v python3 >/dev/null 2>&1; then
"${sudo_cmd[@]}" python3 - "$npm_root" "$db_path" "$target" "$app_domain" "$api_domain" "$native_host" "$docker_web_host" "$docker_api_host" "$web_port" "$api_port" <<'PY'
import json
import re
import sqlite3
import sys
from pathlib import Path
(
npm_root,
db_path,
target,
app_domain,
api_domain,
native_host,
docker_web_host,
docker_api_host,
web_port,
api_port,
) = sys.argv[1:]
web_host = native_host if target == "native" else docker_web_host
api_host = native_host if target == "native" else docker_api_host
def has_domain(raw, domain):
try:
return domain in json.loads(raw)
except Exception:
return domain in raw
def replace_nth(text, pattern, replacement, index):
matches = list(pattern.finditer(text))
if len(matches) < index:
raise SystemExit(f"Unable to rewrite generated proxy config; expected match {index} for {pattern.pattern!r}")
match = matches[index - 1]
return text[:match.start()] + replacement(match) + text[match.end():]
server_pattern = re.compile(r'^(?P<prefix>\s*set \$server\s+)".*?";\s*$', re.M)
port_pattern = re.compile(r'^(?P<prefix>\s*set \$port\s+)\d+;\s*$', re.M)
def replace_server(text, host, index):
return replace_nth(text, server_pattern, lambda m: f'{m.group("prefix")}"{host}";', index)
def replace_port(text, port, index):
return replace_nth(text, port_pattern, lambda m: f'{m.group("prefix")}{port};', index)
con = sqlite3.connect(db_path)
rows = list(con.execute("select id, domain_names from proxy_host where is_deleted = 0"))
app_ids = [row_id for row_id, domains in rows if has_domain(domains, app_domain)]
api_ids = [row_id for row_id, domains in rows if has_domain(domains, api_domain)]
if len(app_ids) != 1 or len(api_ids) != 1:
raise SystemExit(f"Expected one app and one API proxy host, found app={app_ids} api={api_ids}")
api_conf = Path(npm_root) / "data/nginx/proxy_host" / f"{api_ids[0]}.conf"
app_conf = Path(npm_root) / "data/nginx/proxy_host" / f"{app_ids[0]}.conf"
if api_conf.exists():
text = api_conf.read_text()
text = replace_server(text, api_host, 1)
text = replace_port(text, int(api_port), 1)
api_conf.write_text(text)
print(f"Synchronized {api_conf.name} -> {api_host}:{api_port}")
if app_conf.exists():
text = app_conf.read_text()
text = replace_server(text, web_host, 1)
text = replace_port(text, int(web_port), 1)
text = replace_server(text, api_host, 2)
text = replace_port(text, int(api_port), 2)
app_conf.write_text(text)
print(f"Synchronized {app_conf.name} -> {web_host}:{web_port} and API matcher -> {api_host}:{api_port}")
PY
fi
if [[ "$restart_npm" == "0" ]]; then
echo "NPM container restart skipped because NPM_RESTART=0."
elif command -v docker >/dev/null 2>&1 && docker ps --format '{{.Names}}' | grep -qx "$npm_container"; then
docker restart "$npm_container" >/dev/null
echo "Restarted $npm_container"
else
echo "NPM container restart skipped; restart it manually if it is not managed by Docker on this host."
fi
if command -v docker >/dev/null 2>&1 && docker ps --format '{{.Names}}' | grep -qx "$npm_container"; then
"${sudo_cmd[@]}" python3 - "$npm_root" "$db_path" "$target" "$app_domain" "$api_domain" "$native_host" "$docker_web_host" "$docker_api_host" "$web_port" "$api_port" <<'PY'
import json
import re
import sqlite3
import sys
from pathlib import Path
(
npm_root,
db_path,
target,
app_domain,
api_domain,
native_host,
docker_web_host,
docker_api_host,
web_port,
api_port,
) = sys.argv[1:]
web_host = native_host if target == "native" else docker_web_host
api_host = native_host if target == "native" else docker_api_host
def has_domain(raw, domain):
try:
return domain in json.loads(raw)
except Exception:
return domain in raw
def replace_nth(text, pattern, replacement, index):
matches = list(pattern.finditer(text))
if len(matches) < index:
raise SystemExit(f"Unable to rewrite generated proxy config; expected match {index} for {pattern.pattern!r}")
match = matches[index - 1]
return text[:match.start()] + replacement(match) + text[match.end():]
server_pattern = re.compile(r'^(?P<prefix>\s*set \$server\s+)".*?";\s*$', re.M)
port_pattern = re.compile(r'^(?P<prefix>\s*set \$port\s+)\d+;\s*$', re.M)
def replace_server(text, host, index):
return replace_nth(text, server_pattern, lambda m: f'{m.group("prefix")}"{host}";', index)
def replace_port(text, port, index):
return replace_nth(text, port_pattern, lambda m: f'{m.group("prefix")}{port};', index)
con = sqlite3.connect(db_path)
rows = list(con.execute("select id, domain_names from proxy_host where is_deleted = 0"))
app_ids = [row_id for row_id, domains in rows if has_domain(domains, app_domain)]
api_ids = [row_id for row_id, domains in rows if has_domain(domains, api_domain)]
if len(app_ids) != 1 or len(api_ids) != 1:
raise SystemExit(f"Expected one app and one API proxy host, found app={app_ids} api={api_ids}")
api_conf = Path(npm_root) / "data/nginx/proxy_host" / f"{api_ids[0]}.conf"
app_conf = Path(npm_root) / "data/nginx/proxy_host" / f"{app_ids[0]}.conf"
if api_conf.exists():
text = api_conf.read_text()
text = replace_server(text, api_host, 1)
text = replace_port(text, int(api_port), 1)
api_conf.write_text(text)
if app_conf.exists():
text = app_conf.read_text()
text = replace_server(text, web_host, 1)
text = replace_port(text, int(web_port), 1)
text = replace_server(text, api_host, 2)
text = replace_port(text, int(api_port), 2)
app_conf.write_text(text)
PY
reloaded=0
for _ in 1 2 3 4 5; do
if docker exec "$npm_container" nginx -s reload >/dev/null 2>&1; then
reloaded=1
break
fi
sleep 1
done
if [[ "$reloaded" == "1" ]]; then
echo "Reloaded $npm_container"
else
echo "Warning: $npm_container reload did not succeed after restart; verify the container is healthy." >&2
fi
fi
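
The config synchronization in switch-npm-edge.sh is positional: in the app host's generated file, the first `set $server` / `set $port` pair is treated as the web upstream and the second pair as the API matcher from the advanced config. A minimal sketch of that rewrite, assuming a made-up config fragment (the `sample` text and the host and port values are illustrative, not real NPM output):

```python
import re

# Same patterns as the script: whole `set $server "...";` and `set $port N;` lines.
server_pattern = re.compile(r'^(?P<prefix>\s*set \$server\s+)".*?";\s*$', re.M)
port_pattern = re.compile(r'^(?P<prefix>\s*set \$port\s+)\d+;\s*$', re.M)


def replace_nth(text, pattern, replacement, index):
    """Replace only the index-th (1-based) match of pattern in text."""
    matches = list(pattern.finditer(text))
    if len(matches) < index:
        raise ValueError(f"expected match {index} for {pattern.pattern!r}")
    match = matches[index - 1]
    return text[:match.start()] + replacement(match) + text[match.end():]


def replace_server(text, host, index):
    return replace_nth(text, server_pattern, lambda m: f'{m.group("prefix")}"{host}";', index)


def replace_port(text, port, index):
    return replace_nth(text, port_pattern, lambda m: f'{m.group("prefix")}{port};', index)


# Hypothetical stand-in for a generated proxy host file.
sample = '''server {
  set $server "web";
  set $port 3000;
  location ~ ^/(ws|replay)/ {
    set $server "api";
    set $port 4000;
  }
}
'''

# Arbitrary illustrative targets: pair 1 is the web upstream, pair 2 the API matcher.
out = replace_server(sample, "10.0.0.1", 1)
out = replace_port(out, 3000, 1)
out = replace_server(out, "10.0.0.2", 2)
out = replace_port(out, 4001, 2)
print(out)
```

Because the rewrite depends on match order, a generated file with an extra `set $server` line ahead of the web upstream would be rewritten incorrectly. The script only guards against too few matches, not unexpected ones.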

View file

@ -0,0 +1,17 @@
[Unit]
Description=Islandflow ClickHouse
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart=/usr/bin/env clickhouse-server --config-file=/etc/clickhouse-server/config.xml
Restart=always
RestartSec=5
User=clickhouse
Group=clickhouse
StateDirectory=clickhouse
LimitNOFILE=262144
[Install]
WantedBy=multi-user.target

View file

@ -0,0 +1,18 @@
[Unit]
Description=Islandflow NATS JetStream
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart=/usr/sbin/nats-server -js -sd /var/lib/islandflow/nats -a 127.0.0.1 -p 4222 -m 8222
Restart=always
RestartSec=2
User=nats
Group=nats
RuntimeDirectory=islandflow-nats
StateDirectory=islandflow/nats
LimitNOFILE=1048576
[Install]
WantedBy=multi-user.target

View file

@ -0,0 +1,18 @@
[Unit]
Description=Islandflow Redis
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/bin/env redis-server /etc/islandflow/redis.conf --supervised systemd --daemonize no
Restart=always
RestartSec=2
User=redis
Group=redis
RuntimeDirectory=islandflow-redis
StateDirectory=islandflow/redis
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target

View file

@ -0,0 +1,19 @@
[Unit]
Description=Islandflow API
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
# Defaults only: values in the EnvironmentFile below override Environment= settings.
Environment=API_HOST=0.0.0.0
Environment=API_PORT=4000
EnvironmentFile=/home/delta/islandflow/.env
ExecStart=/home/delta/.bun/bin/bun services/api/src/index.ts
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target

View file

@ -0,0 +1,17 @@
[Unit]
Description=Islandflow candles
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
EnvironmentFile=/home/delta/islandflow/.env
ExecStart=/home/delta/.bun/bin/bun services/candles/src/index.ts
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target

View file

@ -0,0 +1,17 @@
[Unit]
Description=Islandflow compute
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
EnvironmentFile=/home/delta/islandflow/.env
ExecStart=/home/delta/.bun/bin/bun services/compute/src/index.ts
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target

View file

@ -0,0 +1,17 @@
[Unit]
Description=Islandflow ingest-equities
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
EnvironmentFile=/home/delta/islandflow/.env
ExecStart=/home/delta/.bun/bin/bun services/ingest-equities/src/index.ts
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target

View file

@ -0,0 +1,18 @@
[Unit]
Description=Islandflow ingest-options
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
EnvironmentFile=/home/delta/islandflow/.env
# Default only: systemd lets EnvironmentFile= override Environment=, so a value in .env wins.
Environment=OPTIONS_INGEST_ADAPTER=synthetic
ExecStart=/home/delta/.bun/bin/bun services/ingest-options/src/index.ts
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target

View file

@ -0,0 +1,19 @@
[Unit]
Description=Islandflow web
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=/home/delta/islandflow
# Defaults only: values in the EnvironmentFile below override Environment= settings.
Environment=WEB_HOST=0.0.0.0
Environment=WEB_PORT=3000
EnvironmentFile=/home/delta/islandflow/.env
ExecStart=/bin/sh -lc 'cd /home/delta/islandflow/apps/web && exec /home/delta/.bun/bin/bun x next start -H "$WEB_HOST" -p "$WEB_PORT"'
Restart=always
RestartSec=2
KillSignal=SIGINT
TimeoutStopSec=20
[Install]
WantedBy=default.target