13 KiB
Real-Time Options Flow & Off-Exchange Analysis
This repository contains a real-time market-flow analysis platform focused on options flow, off-exchange equity trades, and inferred institutional behavior, built for low-latency, explainable analysis rather than black-box signals.
The system ingests real-time options trades/quotes and equity prints, clusters raw activity into higher-level flow events (sweeps, spreads, rolls, ladders), applies rule-first classifiers, and visualizes the results through a high-performance, TradingView-smooth interface with full replay and backtesting support.
CURRENT STATE (Plan Progress)
Plan progress (rough): [#####-----]
Done now (in repo):
- Bun monorepo + infra docker compose (ClickHouse, Redis, NATS JetStream)
- Shared event schemas + logging + config helpers
- Synthetic options/equity prints (full S&P 500) published to NATS and persisted to ClickHouse
- Deterministic option FlowPacket clustering (time window) + persistence
- Rolling stats in Redis (premium/size/spread) with z-score features on FlowPackets
- FlowPacket structure tags (vertical/ladder/straddle/strangle) for multi-leg bursts
- Aggressor mix features (NBBO placement ratios) on FlowPackets
- Rule-first classifiers + alert scoring with ClickHouse persistence + WS/REST endpoints
- API: REST for prints/flow packets/classifier hits/alerts, WS for live options/equities/flow/alerts/hits, replay endpoints
- UI: live tapes for options/equities/flow + replay toggle + pause controls + replay time/completion
- UI: alerts + classifier hits panels, ticker filter, evidence drawer, severity strip
- Databento historical replay adapter (options) with symbol mapping
- Alpaca options adapter (dev-only, bounded contract list)
- IBKR options adapter (single-underlying bridge via
ib_insync) - Dark-pool-style inference (absorbed blocks, stealth accumulation, distribution) with WS/REST surfaces and UI list
- Testing-mode throttling for ingest to reduce CPU during local dev
In progress / blocked:
- Production-grade licensed live data feeds (beyond current dev/test bridges)
- Advanced clustering (spreads/rolls beyond basic structure tags)
- Candles/overlays service (scaffolded, not yet emitting data)
Not started:
- Reference data/corporate action enrichment
- Auth / secure deployment
Core Principles
- Explainability first — every alert and signal is backed by observable data and explicit logic.
- Event-sourced architecture — all raw and derived events are persisted and replayable.
- Market microstructure correctness — conservative handling of aggressor inference, OI, and off-exchange prints.
- Low-latency, tangible UX — smooth real-time interaction that feels like an instrument panel, not a spreadsheet.
Current Capabilities
- Synthetic options/equity prints with deterministic sequencing across the S&P 500
- Ingest adapter seam (env-selected; options default
synthetic, equities defaultsynthetic) - Raw event persistence in ClickHouse + streaming via NATS JetStream
- Deterministic option FlowPacket clustering (time-window)
- Rolling stats baselines in Redis with z-score features on FlowPackets
- Basic multi-leg structure tagging on FlowPackets
- Aggressor mix features from NBBO placement on FlowPackets
- Classifiers + alert scoring (rule-first) with WS/REST endpoints
- API gateway with REST, WS, and replay endpoints
- UI tapes for options/equities/flow packets + alerts/hits with live/replay toggle and pause controls
- Alpaca options adapter (dev-only) with bounded contract selection
- IBKR options adapter (single-underlying bridge via Python sidecar)
- Databento historical replay adapter (options, Python sidecar)
- Dark-pool-style inference (absorbed blocks, stealth accumulation, distribution) with evidence links and replay
Planned Capabilities (from PLAN.md)
- Real-time licensed market data ingestors (options + equities)
- Candle aggregation + chart overlays
- Replay/backtesting metrics and calibration
- Reference data, symbology, and corporate-action handling
Tech Stack
- Runtime & tooling: Bun
- Language: TypeScript
- Frontend: Next.js + React
- Realtime: WebSockets
- Event streaming: NATS JetStream or Redpanda
- Storage: ClickHouse, Redis
- Charting: TradingView Lightweight Charts + custom canvas/WebGL overlays
Repository Structure
apps/ web/ services/ ingest-options/ ingest-equities/ compute/ api/ packages/ types/ ui/ chart/
Build and Run
Install dependencies:
bun install
Start infra:
docker compose up -d
Create env file:
- Copy
.env.exampleto.envand fill in the API keys you plan to use.
Start everything (infra + services + web):
bun run dev
Run just the web app (fixed to port 3000):
bun run dev:web
Run just the API:
bun --cwd services/api run dev
Environment Configuration
All runtime configuration is driven by .env. Start by copying .env.example and edit the values you need. Defaults below match .env.example unless otherwise noted.
Core infrastructure
These define how services connect to the event bus and storage backends. Documentation links are provided for convenience.
NATS_URL(defaultnats://localhost:4222) — NATS JetStream endpoint. See NATS and JetStream.CLICKHOUSE_URL(defaulthttp://localhost:8123) — ClickHouse HTTP endpoint. See ClickHouse.CLICKHOUSE_DATABASE(defaultdefault) — ClickHouse database name.REDIS_URL(defaultredis://localhost:6379) — Redis endpoint for rolling stats. See Redis.
Adapter selection
OPTIONS_INGEST_ADAPTER(defaultsynthetic) — options ingest adapter:synthetic,alpaca,ibkr,databento.EQUITIES_INGEST_ADAPTER(defaultsynthetic) — equities ingest adapter.EMIT_INTERVAL_MS(default1000) — synthetic equities emit cadence.
Alpaca options adapter (dev-only)
Provider links: Alpaca, Alpaca Market Data API.
ALPACA_KEY_ID,ALPACA_SECRET_KEY— credentials.ALPACA_REST_URL(defaulthttps://data.alpaca.markets) — REST endpoint.ALPACA_WS_BASE_URL(defaultwss://stream.data.alpaca.markets/v1beta1) — streaming endpoint.ALPACA_FEED(defaultindicative) — useoprawhen you have a subscription.ALPACA_UNDERLYINGS(defaultSPY,NVDA,AAPL) — comma-separated list of symbols.ALPACA_STRIKES_PER_SIDE(default8) — strikes per side around ATM.ALPACA_MAX_DTE_DAYS(default30) — expiry horizon.ALPACA_MONEYNESS_PCT(default0.06) — ATM band for strike selection.ALPACA_MONEYNESS_FALLBACK_PCT(default0.1) — fallback band if strikes are sparse.ALPACA_MAX_QUOTES(default200) — subscription size guardrail.
Databento historical replay adapter
Provider links: Databento, Databento API.
DATABENTO_API_KEY— API key.DATABENTO_DATASET(defaultOPRA.PILLAR) — dataset.DATABENTO_SCHEMA(defaulttrades) — schema.DATABENTO_START— ISO date/time start for replay.DATABENTO_END— ISO date/time end (optional).DATABENTO_SYMBOLS(defaultSPY.OPT) — comma list or dataset symbols.DATABENTO_STYPE_IN(defaultparent) — input symbology type.DATABENTO_STYPE_OUT(defaultinstrument_id) — output symbology type.DATABENTO_LIMIT(default0) — record cap (0 means no cap).DATABENTO_PRICE_SCALE(default1) — divide raw price by this value.DATABENTO_PYTHON_BIN(defaultpy/.venv/bin/python) — Python executable for replay sidecar.
IBKR options adapter (Python sidecar)
Provider links: Interactive Brokers, IBKR API docs.
IBKR_HOST(default127.0.0.1) — TWS/Gateway host.IBKR_PORT(default7497) — TWS/Gateway port.IBKR_CLIENT_ID(default0) — API client ID.IBKR_SYMBOL(defaultSPY) — underlying symbol.IBKR_EXPIRY(default20250117) — expiry inYYYYMMDD.IBKR_STRIKE(default450) — strike price.IBKR_RIGHT(defaultC) — option right (CorP).IBKR_EXCHANGE(defaultSMART) — exchange route.IBKR_CURRENCY(defaultUSD) — currency.IBKR_PYTHON_BIN(defaultpython3) — Python executable for sidecar.
Compute + market-structure tuning
COMPUTE_DELIVER_POLICY(defaultnew) — consumer start behavior (neworall).COMPUTE_CONSUMER_RESET(defaultfalse) — force consumer reset (skip backlog).NBBO_MAX_AGE_MS(default1000) — max allowed NBBO age for joins.NEXT_PUBLIC_NBBO_MAX_AGE_MS(default1000) — UI-visible NBBO age for display gating.ROLLING_WINDOW_SIZE(default50) — rolling stats window length.ROLLING_TTL_SEC(default86400) — rolling stats TTL in seconds.
Classifier thresholds
CLASSIFIER_SWEEP_MIN_PREMIUM(default40000) — absolute sweep premium floor.CLASSIFIER_SWEEP_MIN_COUNT(default3) — minimum leg count for sweeps.CLASSIFIER_SWEEP_MIN_PREMIUM_Z(default2) — sweep premium z-score threshold.CLASSIFIER_SPIKE_MIN_PREMIUM(default20000) — absolute spike premium floor.CLASSIFIER_SPIKE_MIN_SIZE(default400) — absolute spike size floor.CLASSIFIER_SPIKE_MIN_PREMIUM_Z(default2.5) — spike premium z-score threshold.CLASSIFIER_SPIKE_MIN_SIZE_Z(default2) — spike size z-score threshold.CLASSIFIER_Z_MIN_SAMPLES(default12) — minimum samples before z-scores apply.CLASSIFIER_MIN_NBBO_COVERAGE(default0.5) — NBBO coverage ratio gate.CLASSIFIER_MIN_AGGRESSOR_RATIO(default0.55) — aggressor ratio gate.CLASSIFIER_0DTE_MAX_ATM_PCT(default0.01) — max ATM distance as pct of underlying for 0DTE gamma punch.CLASSIFIER_0DTE_MIN_PREMIUM(default20000) — 0DTE gamma punch premium floor.CLASSIFIER_0DTE_MIN_SIZE(default400) — 0DTE gamma punch size floor.
Testing + throttling
TESTING_MODE(defaultfalse) — enable ingest throttling for local dev.TESTING_THROTTLE_MS(default200) — minimum spacing between emitted prints.
Testing mode (throttles ingest to reduce CPU):
TESTING_MODE=trueenables throttlingTESTING_THROTTLE_MS=200minimum spacing between emitted prints (per ingest service)
IBKR adapter (options, via Python ib_insync):
- Install Python deps:
python3 -m pip install -r services/ingest-options/py/requirements.txt - Set
OPTIONS_INGEST_ADAPTER=ibkrand configure:IBKR_HOST,IBKR_PORT,IBKR_CLIENT_IDIBKR_SYMBOL,IBKR_EXPIRY(YYYYMMDD),IBKR_STRIKE,IBKR_RIGHT- Optional:
IBKR_EXCHANGE(defaultSMART),IBKR_CURRENCY(defaultUSD),IBKR_PYTHON_BIN
Alpaca adapter (options, dev-only bridge):
- Set
OPTIONS_INGEST_ADAPTER=alpacaand configure:ALPACA_KEY_ID,ALPACA_SECRET_KEYALPACA_UNDERLYINGS(comma-separated, defaultSPY,NVDA,AAPL)- Optional:
ALPACA_FEED(indicativedefault,oprawith subscription) - Optional:
ALPACA_MAX_QUOTES(default200),ALPACA_REST_URL,ALPACA_WS_BASE_URL - Optional selection tuning:
ALPACA_STRIKES_PER_SIDE(default8),ALPACA_MAX_DTE_DAYS(default30),ALPACA_MONEYNESS_PCT(default0.06),ALPACA_MONEYNESS_FALLBACK_PCT(default0.10)
Alpaca selection policy (dev-only, deterministic):
- Pick nearest weekly and nearest monthly expiries within 30 DTE (fallback to earliest expiries if missing)
- For each expiry, select 8 strikes per side closest to ATM within ±6% (fallback to ±10% if needed)
- Subscriptions are built once at startup to keep the stream bounded and repeatable
Databento historical replay adapter (options, via Python databento):
- Install Python deps:
python3 -m pip install -r services/ingest-options/py/requirements.txt - Set
OPTIONS_INGEST_ADAPTER=databentoand configure:DATABENTO_API_KEY,DATABENTO_START(ISO date/time)- Optional:
DATABENTO_END,DATABENTO_DATASET(defaultOPRA.PILLAR),DATABENTO_SCHEMA(defaulttrades) - Optional:
DATABENTO_SYMBOLS(ALLor comma list),DATABENTO_STYPE_IN/DATABENTO_STYPE_OUT(defaultraw_symbol) - Optional:
DATABENTO_LIMIT(record cap),DATABENTO_PRICE_SCALE(divide raw price),DATABENTO_PYTHON_BIN
- This adapter replays historical data only; live capture will be added later.
Run tests:
bun test
Status
Active build for personal, non-delayed analytical use. Multi-user access and redistribution are intentionally out of scope.
Non-Goals
- No black-box AI predictions
- No profit guarantees
- No real-time data redistribution
- No guessing at intent without evidence
License / Usage
For research and personal analytical use.
Market data usage is subject to the terms of the data providers.