islandflow/PLAN.md
dirtydishes 82861408e4 Track project docs
Remove docs from .gitignore and add AGENTS.md, PLAN.md, CODING_STYLE.md, and RESEARCH.md to the repository.
2025-12-29 22:10:26 -05:00

265 lines
8.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# PLAN.md — Real-Time Options Flow & Off-Exchange Analysis Platform
## Purpose
Build a **real-time, non-delayed** market-flow analysis system for **personal use** that ingests options trades/quotes and equity prints, clusters raw activity into higher-level flow events, applies **explainable rule-first classifiers**, infers dark-pool-like behavior, and visualizes everything in a **TradingView-smooth** interface with full replay and backtesting.
---
## Non-Negotiables
- **Runtime & tooling:** Bun everywhere (services, scripts, dev, CI)
- **Language:** TypeScript
- **Frontend:** Next.js + React (App Router)
- **Realtime:** WebSockets (server → client)
- **Eventing:** NATS JetStream (default) or Redpanda (Kafka-compatible)
- **Storage:** ClickHouse (authoritative event log + analytics), Redis (hot state)
- **Charting:** TradingView Lightweight Charts + custom Canvas/WebGL overlays
- **Scope:** Personal, non-delayed use only (no redistribution)
---
## Guiding Principles
- **Explainability first:** every alert links to evidence and explicit logic.
- **Event-sourced:** raw and derived events are persisted and replayable.
- **Microstructure correctness:** conservative inference, explicit confidence.
- **Low latency UX:** smooth pan/zoom, minimal main-thread work.
- **Determinism:** live behavior equals replay behavior.
---
## High-Level Architecture
**Sources → Ingest → Event Bus → Compute → Storage → API/WS → UI**
- Sources: options trades/quotes (OPRA-derived via licensed source), equity trades/quotes (incl. off-exchange flags)
- Ingest services normalize and publish immutable events
- Compute clusters prints, computes rolling stats, runs classifiers, emits alerts and inferred events
- ClickHouse stores everything; Redis serves hot joins/baselines
- API/WS streams curated live data and serves historical queries
- Next.js UI renders live terminals and charts
---
## Monorepo Layout (Bun workspaces)
apps/
web/ # Next.js UI (flow, charts, alerts)
services/
ingest-options/ # Options feed adapters (trades + NBBO)
ingest-equities/ # Equity trades/quotes ingestion
compute/ # Clustering, stats, classifiers, inference
candles/ # Server-side candle aggregation
refdata/ # Symbols, chains, corp actions
eod-enricher/ # OI + metadata snapshots
api/ # REST + WebSocket gateway
packages/
types/ # Shared TS types + zod schemas
ui/ # Design system + motion primitives
chart/ # Chart wrappers + overlay renderers
---
## Core Event Schemas (canonical)
- `OptionPrint` `{ ts, option_contract_id, price, size, exchange, conditions }`
- `OptionNBBO` `{ ts, option_contract_id, bid, ask, bidSize, askSize }`
- `EquityPrint` `{ ts, underlying_id, price, size, exchange, offExchangeFlag }`
- `EquityQuote` `{ ts, underlying_id, bid, ask }`
- `FlowPacket` `{ id, members[], features{}, join_quality{} }`
- `ClassifierHit` `{ classifier_id, confidence, direction, explanations[] }`
- `AlertEvent` `{ score, severity, hits[], evidence_refs[] }`
- `InferredDarkEvent` `{ type, confidence, evidence_refs[] }`
All events include `{ source_ts, ingest_ts, seq, trace_id }`.
---
## Epic 1 — Repo Scaffold & Infra (Day 1)
**Build**
- Initialize Bun monorepo; Docker compose for ClickHouse, Redis, NATS.
- Shared config, logging (JSON), metrics hooks.
- Define zod schemas + TS types.
**Acceptance**
- `bun run dev` boots infra + empty services + web shell.
---
## Epic 2 — Realtime Ingestion (Days 24)
### Options Ingestor
- Adapter interface: connect/subscribe/onTrade/onNBBO.
- Normalize to `OptionPrint`/`OptionNBBO`; publish.
### Equity Ingestor
- Stream `EquityPrint`/`EquityQuote`; tag off-exchange when provided.
**Acceptance**
- Live events visible via CLI subscriber.
- Raw events persisted to ClickHouse.
---
## Epic 3 — Rolling Stats & Clustering (Days 47)
### Rolling Stats (Redis)
- Premium/size baselines (median/MAD or mean/std).
- Intraday curves; liquidity/spread penalties.
### Clustering
- Contract sweeps (2502000ms windows).
- Adjacent-strike ladders.
- Multi-leg detection (spreads/straddles/rolls).
**Acceptance**
- Deterministic `FlowPacket` emission; replayable.
---
## Epic 4 — Classifiers & Alert Scoring (Days 710)
Implement rule-first classifiers (each returns confidence + explanation):
1. Large Bullish Call Sweep
2. Large Bearish Put Sweep
3. Large Call Sell (overwrite)
4. Large Put Sell (put write)
5. Unusual Contract Spike (z-score)
6. New Position Likely (vol/OI)
7. Closing/Unwind Likely
8. 0DTE Gamma Punch (ATM)
9. Far-Dated Conviction (60DTE+)
10. Straddle/Strangle (long/short vol)
11. Vertical Spread (debit/credit)
12. Roll Up/Down/Out
13. Multi-Strike Ladder Accumulation
14. No Follow-Through / Absorbed
**Alert Scoring**
- Weighted score from premium, aggressor, z-scores, structure bonus minus noise.
- Throttles, dedupe, cooldowns.
**Acceptance**
- Alerts fire with human-readable “why”; unit tests per classifier.
---
## Epic 5 — Dark Pool Inference (Days 1012)
**Inference (derived, separate from raw)**
- Absorbed blocks
- Stealth accumulation
- Distribution
- Hidden liquidity zones
**Acceptance**
- `InferredDarkEvent` links to evidence windows and chart markers.
---
## Epic 6 — API & WebSockets (Days 1214)
- REST: queries for prints, packets, inferred events; candle ranges.
- WS: channels for live flow, alerts, equity prints, inferred events.
- Backpressure aware fan-out.
**Acceptance**
- Stable live streaming under load.
---
## Epic 7 — UI: Live Terminals & Workspaces (Days 1418)
- Live options flow terminal (virtualized, deep filters).
- Dark pool/off-exchange tape.
- Alerts center.
- Ticker workspace (flow + chart + tape).
- Tangible UX: motion, depth, blueprint grid.
**Acceptance**
- 60fps interaction while streaming.
---
## Epic 8 — Charting (Days 1824)
**Base**
- TradingView Lightweight Charts (candles, volume, crosshair).
**Overlays**
- Off-exchange prints as circles (radius ~ sqrt(size)).
- Inferred events & classifier markers.
- Viewport-driven rendering; OffscreenCanvas when available.
**Acceptance**
- TV-smooth pan/zoom; overlays stay aligned.
---
## Epic 9 — Replay & Backtesting (Days 2428)
- Replay mode re-streams from ClickHouse.
- Metrics: forward returns, hit rates, calibration.
**Acceptance**
- Live and replay share the same pipeline.
---
## ADDENDUM — Missing-but-Crucial Epics
### Epic 10 — Reference Data, Symbology & Corporate Actions
- Underlyings, option chains, OCC adjustments, corp actions.
- Canonical IDs; symbol normalizer.
**Acceptance**
- Splits/adjustments dont break replay or joins.
### Epic 11 — EOD Enrichment (OI & Metadata)
- Nightly OI snapshots; provenance tagging.
**Acceptance**
- OI-based features reference correct snapshot date.
### Epic 12 — Time Sync, Ordering & Join Quality
- NTP/chrony; bounded join windows; join quality scores.
**Acceptance**
- Stable aggressor inference under replay.
### Epic 13 — Candle Aggregation Service
- Server-built 1s/5s/1m OHLCV; Redis hot cache.
**Acceptance**
- Candle queries <100ms for hot ranges.
### Epic 14 — Backpressure & Load Shedding
- Bounded queues; UI sampling; DLQ; replay namespace.
**Acceptance**
- System degrades gracefully during spikes.
### Epic 15 — Observability
- Metrics, structured logs, tracing IDs.
**Acceptance**
- End-to-end lag visible in one dashboard.
### Epic 16 — Secure Personal Deployment
- Auth (single-user), TLS, rate limits; no public endpoints.
**Acceptance**
- Anonymous access blocked; VPS-safe.
### Epic 17 — UX State: Saved Filters, Workspaces, Hotkeys
- Presets, layouts, evidence panel.
**Acceptance**
- One-click reproducibility of setups.
---
## Milestones
- **MVP-1:** realtime options flow, clustering, 10+ classifiers, live terminal.
- **MVP-2:** off-exchange prints, inferred DP events, chart overlays, alerts.
- **MVP-2.5:** candles, observability, backpressure, auth.
- **MVP-3:** replay/backtesting, metrics dashboards.
---
## Non-Goals
- No black-box predictions
- No profit guarantees
- No real-time redistribution
## Notes
- Market data usage must comply with provider terms.
- All inference is probabilistic and labeled as such.