islandflow/README.md
2026-01-09 16:11:38 -05:00

13 KiB

Real-Time Options Flow & Off-Exchange Analysis

This repository contains a real-time market-flow analysis platform focused on options flow, off-exchange equity trades, and inferred institutional behavior, built for low-latency, explainable analysis rather than black-box signals.

The system ingests real-time options trades/quotes and equity prints, clusters raw activity into higher-level flow events (sweeps, spreads, rolls, ladders), applies rule-first classifiers, and visualizes the results through a high-performance, TradingView-smooth interface with full replay and backtesting support.

CURRENT STATE (Plan Progress)

Plan progress (rough): [#####-----]

Done now (in repo):

  • Bun monorepo + infra docker compose (ClickHouse, Redis, NATS JetStream)
  • Shared event schemas + logging + config helpers
  • Synthetic options/equity prints (full S&P 500) published to NATS and persisted to ClickHouse
  • Deterministic option FlowPacket clustering (time window) + persistence
  • Rolling stats in Redis (premium/size/spread) with z-score features on FlowPackets
  • FlowPacket structure tags (vertical/ladder/straddle/strangle) for multi-leg bursts
  • Aggressor mix features (NBBO placement ratios) on FlowPackets
  • Rule-first classifiers + alert scoring with ClickHouse persistence + WS/REST endpoints
  • API: REST for prints/flow packets/classifier hits/alerts, WS for live options/equities/flow/alerts/hits, replay endpoints
  • UI: live tapes for options/equities/flow + replay toggle + pause controls + replay time/completion
  • UI: alerts + classifier hits panels, ticker filter, evidence drawer, severity strip
  • Databento historical replay adapter (options) with symbol mapping
  • Alpaca options adapter (dev-only, bounded contract list)
  • IBKR options adapter (single-underlying bridge via ib_insync)
  • Dark-pool-style inference (absorbed blocks, stealth accumulation, distribution) with WS/REST surfaces and UI list
  • Testing-mode throttling for ingest to reduce CPU during local dev

In progress / blocked:

  • Production-grade licensed live data feeds (beyond current dev/test bridges)
  • Advanced clustering (spreads/rolls beyond basic structure tags)
  • Candles/overlays service (scaffolded, not yet emitting data)

Not started:

  • Reference data/corporate action enrichment
  • Auth / secure deployment

Core Principles

  • Explainability first — every alert and signal is backed by observable data and explicit logic.
  • Event-sourced architecture — all raw and derived events are persisted and replayable.
  • Market microstructure correctness — conservative handling of aggressor inference, OI, and off-exchange prints.
  • Low-latency, tangible UX — smooth real-time interaction that feels like an instrument panel, not a spreadsheet.

Current Capabilities

  • Synthetic options/equity prints with deterministic sequencing across the S&P 500
  • Ingest adapter seam (env-selected; options default synthetic, equities default synthetic)
  • Raw event persistence in ClickHouse + streaming via NATS JetStream
  • Deterministic option FlowPacket clustering (time-window)
  • Rolling stats baselines in Redis with z-score features on FlowPackets
  • Basic multi-leg structure tagging on FlowPackets
  • Aggressor mix features from NBBO placement on FlowPackets
  • Classifiers + alert scoring (rule-first) with WS/REST endpoints
  • API gateway with REST, WS, and replay endpoints
  • UI tapes for options/equities/flow packets + alerts/hits with live/replay toggle and pause controls
  • Alpaca options adapter (dev-only) with bounded contract selection
  • IBKR options adapter (single-underlying bridge via Python sidecar)
  • Databento historical replay adapter (options, Python sidecar)
  • Dark-pool-style inference (absorbed blocks, stealth accumulation, distribution) with evidence links and replay

Planned Capabilities (from PLAN.md)

  • Real-time licensed market data ingestors (options + equities)
  • Candle aggregation + chart overlays
  • Replay/backtesting metrics and calibration
  • Reference data, symbology, and corporate-action handling

Tech Stack

  • Runtime & tooling: Bun
  • Language: TypeScript
  • Frontend: Next.js + React
  • Realtime: WebSockets
  • Event streaming: NATS JetStream or Redpanda
  • Storage: ClickHouse, Redis
  • Charting: TradingView Lightweight Charts + custom canvas/WebGL overlays

Repository Structure

apps/ web/ services/ ingest-options/ ingest-equities/ compute/ api/ packages/ types/ ui/ chart/

Build and Run

Install dependencies:

  • bun install

Start infra:

  • docker compose up -d

Create env file:

  • Copy .env.example to .env and fill in the API keys you plan to use.

Start everything (infra + services + web):

  • bun run dev

Run just the web app (fixed to port 3000):

  • bun run dev:web

Run just the API:

  • bun --cwd services/api run dev

Environment Configuration

All runtime configuration is driven by .env. Start by copying .env.example and edit the values you need. Defaults below match .env.example unless otherwise noted.

Core infrastructure

These define how services connect to the event bus and storage backends. Documentation links are provided for convenience.

  • NATS_URL (default nats://localhost:4222) — NATS JetStream endpoint. See NATS and JetStream.
  • CLICKHOUSE_URL (default http://localhost:8123) — ClickHouse HTTP endpoint. See ClickHouse.
  • CLICKHOUSE_DATABASE (default default) — ClickHouse database name.
  • REDIS_URL (default redis://localhost:6379) — Redis endpoint for rolling stats. See Redis.

Adapter selection

  • OPTIONS_INGEST_ADAPTER (default synthetic) — options ingest adapter: synthetic, alpaca, ibkr, databento.
  • EQUITIES_INGEST_ADAPTER (default synthetic) — equities ingest adapter.
  • EMIT_INTERVAL_MS (default 1000) — synthetic equities emit cadence.

Alpaca options adapter (dev-only)

Provider links: Alpaca, Alpaca Market Data API.

  • ALPACA_KEY_ID, ALPACA_SECRET_KEY — credentials.
  • ALPACA_REST_URL (default https://data.alpaca.markets) — REST endpoint.
  • ALPACA_WS_BASE_URL (default wss://stream.data.alpaca.markets/v1beta1) — streaming endpoint.
  • ALPACA_FEED (default indicative) — use opra when you have a subscription.
  • ALPACA_UNDERLYINGS (default SPY,NVDA,AAPL) — comma-separated list of symbols.
  • ALPACA_STRIKES_PER_SIDE (default 8) — strikes per side around ATM.
  • ALPACA_MAX_DTE_DAYS (default 30) — expiry horizon.
  • ALPACA_MONEYNESS_PCT (default 0.06) — ATM band for strike selection.
  • ALPACA_MONEYNESS_FALLBACK_PCT (default 0.1) — fallback band if strikes are sparse.
  • ALPACA_MAX_QUOTES (default 200) — subscription size guardrail.

Databento historical replay adapter

Provider links: Databento, Databento API.

  • DATABENTO_API_KEY — API key.
  • DATABENTO_DATASET (default OPRA.PILLAR) — dataset.
  • DATABENTO_SCHEMA (default trades) — schema.
  • DATABENTO_START — ISO date/time start for replay.
  • DATABENTO_END — ISO date/time end (optional).
  • DATABENTO_SYMBOLS (default SPY.OPT) — comma list or dataset symbols.
  • DATABENTO_STYPE_IN (default parent) — input symbology type.
  • DATABENTO_STYPE_OUT (default instrument_id) — output symbology type.
  • DATABENTO_LIMIT (default 0) — record cap (0 means no cap).
  • DATABENTO_PRICE_SCALE (default 1) — divide raw price by this value.
  • DATABENTO_PYTHON_BIN (default py/.venv/bin/python) — Python executable for replay sidecar.

IBKR options adapter (Python sidecar)

Provider links: Interactive Brokers, IBKR API docs.

  • IBKR_HOST (default 127.0.0.1) — TWS/Gateway host.
  • IBKR_PORT (default 7497) — TWS/Gateway port.
  • IBKR_CLIENT_ID (default 0) — API client ID.
  • IBKR_SYMBOL (default SPY) — underlying symbol.
  • IBKR_EXPIRY (default 20250117) — expiry in YYYYMMDD.
  • IBKR_STRIKE (default 450) — strike price.
  • IBKR_RIGHT (default C) — option right (C or P).
  • IBKR_EXCHANGE (default SMART) — exchange route.
  • IBKR_CURRENCY (default USD) — currency.
  • IBKR_PYTHON_BIN (default python3) — Python executable for sidecar.

Compute + market-structure tuning

  • COMPUTE_DELIVER_POLICY (default new) — consumer start behavior (new or all).
  • COMPUTE_CONSUMER_RESET (default false) — force consumer reset (skip backlog).
  • NBBO_MAX_AGE_MS (default 1000) — max allowed NBBO age for joins.
  • NEXT_PUBLIC_NBBO_MAX_AGE_MS (default 1000) — UI-visible NBBO age for display gating.
  • ROLLING_WINDOW_SIZE (default 50) — rolling stats window length.
  • ROLLING_TTL_SEC (default 86400) — rolling stats TTL in seconds.

Classifier thresholds

  • CLASSIFIER_SWEEP_MIN_PREMIUM (default 40000) — absolute sweep premium floor.
  • CLASSIFIER_SWEEP_MIN_COUNT (default 3) — minimum leg count for sweeps.
  • CLASSIFIER_SWEEP_MIN_PREMIUM_Z (default 2) — sweep premium z-score threshold.
  • CLASSIFIER_SPIKE_MIN_PREMIUM (default 20000) — absolute spike premium floor.
  • CLASSIFIER_SPIKE_MIN_SIZE (default 400) — absolute spike size floor.
  • CLASSIFIER_SPIKE_MIN_PREMIUM_Z (default 2.5) — spike premium z-score threshold.
  • CLASSIFIER_SPIKE_MIN_SIZE_Z (default 2) — spike size z-score threshold.
  • CLASSIFIER_Z_MIN_SAMPLES (default 12) — minimum samples before z-scores apply.
  • CLASSIFIER_MIN_NBBO_COVERAGE (default 0.5) — NBBO coverage ratio gate.
  • CLASSIFIER_MIN_AGGRESSOR_RATIO (default 0.55) — aggressor ratio gate.
  • CLASSIFIER_0DTE_MAX_ATM_PCT (default 0.01) — max ATM distance as pct of underlying for 0DTE gamma punch.
  • CLASSIFIER_0DTE_MIN_PREMIUM (default 20000) — 0DTE gamma punch premium floor.
  • CLASSIFIER_0DTE_MIN_SIZE (default 400) — 0DTE gamma punch size floor.

Testing + throttling

  • TESTING_MODE (default false) — enable ingest throttling for local dev.
  • TESTING_THROTTLE_MS (default 200) — minimum spacing between emitted prints.

Testing mode (throttles ingest to reduce CPU):

  • TESTING_MODE=true enables throttling
  • TESTING_THROTTLE_MS=200 minimum spacing between emitted prints (per ingest service)

IBKR adapter (options, via Python ib_insync):

  • Install Python deps: python3 -m pip install -r services/ingest-options/py/requirements.txt
  • Set OPTIONS_INGEST_ADAPTER=ibkr and configure:
    • IBKR_HOST, IBKR_PORT, IBKR_CLIENT_ID
    • IBKR_SYMBOL, IBKR_EXPIRY (YYYYMMDD), IBKR_STRIKE, IBKR_RIGHT
    • Optional: IBKR_EXCHANGE (default SMART), IBKR_CURRENCY (default USD), IBKR_PYTHON_BIN

Alpaca adapter (options, dev-only bridge):

  • Set OPTIONS_INGEST_ADAPTER=alpaca and configure:
    • ALPACA_KEY_ID, ALPACA_SECRET_KEY
    • ALPACA_UNDERLYINGS (comma-separated, default SPY,NVDA,AAPL)
    • Optional: ALPACA_FEED (indicative default, opra with subscription)
    • Optional: ALPACA_MAX_QUOTES (default 200), ALPACA_REST_URL, ALPACA_WS_BASE_URL
    • Optional selection tuning: ALPACA_STRIKES_PER_SIDE (default 8), ALPACA_MAX_DTE_DAYS (default 30), ALPACA_MONEYNESS_PCT (default 0.06), ALPACA_MONEYNESS_FALLBACK_PCT (default 0.10)

Alpaca selection policy (dev-only, deterministic):

  • Pick nearest weekly and nearest monthly expiries within 30 DTE (fallback to earliest expiries if missing)
  • For each expiry, select 8 strikes per side closest to ATM within ±6% (fallback to ±10% if needed)
  • Subscriptions are built once at startup to keep the stream bounded and repeatable

Databento historical replay adapter (options, via Python databento):

  • Install Python deps: python3 -m pip install -r services/ingest-options/py/requirements.txt
  • Set OPTIONS_INGEST_ADAPTER=databento and configure:
    • DATABENTO_API_KEY, DATABENTO_START (ISO date/time)
    • Optional: DATABENTO_END, DATABENTO_DATASET (default OPRA.PILLAR), DATABENTO_SCHEMA (default trades)
    • Optional: DATABENTO_SYMBOLS (ALL or comma list), DATABENTO_STYPE_IN/DATABENTO_STYPE_OUT (default raw_symbol)
    • Optional: DATABENTO_LIMIT (record cap), DATABENTO_PRICE_SCALE (divide raw price), DATABENTO_PYTHON_BIN
  • This adapter replays historical data only; live capture will be added later.

Run tests:

  • bun test

Status

Active build for personal, non-delayed analytical use. Multi-user access and redistribution are intentionally out of scope.

Non-Goals

  • No black-box AI predictions
  • No profit guarantees
  • No real-time data redistribution
  • No guessing at intent without evidence

License / Usage

For research and personal analytical use.
Market data usage is subject to the terms of the data providers.