islandflow/docs/implementation/synthetic-market-data/00-roadmap.md

44 lines
3 KiB
Markdown

# Synthetic Market-Data Roadmap
This roadmap breaks `docs/plans/synthetic-market-data-architecture-review.md` into implementation-sized phases. The recommended direction is still Option B: extract deterministic synthetic generation into a first-class reusable engine while keeping the useful NATS, ClickHouse, compute, API, replay, and web stack.
## Source Documents
- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
- Research architecture review copy: [`docs/research-docs/synthetic-data-architecture-review.md`](../../research-docs/synthetic-data-architecture-review.md)
The research documents are background and rationale only. Scope comes from the Beads issue and the phase document.
## Core Constraints
- Emit canonical market event types: `OptionPrint`, `OptionNBBO`, `EquityPrint`, and `EquityQuote`.
- Do not create synthetic-only market event types for the main pipeline.
- Keep hidden ground-truth labels separate from emitted market events.
- Keep early quality gates infra-free: `bun test` should not require Docker, ClickHouse, NATS, or Redis.
- Build deterministic foundations before demos, UI controls, or live synthetic service behavior.
- Treat historical calibration as future work, not as a dependency for the MVP synthetic generator.
## Phase Sequence
| Phase | Beads issue | Depends on | Purpose |
| --- | --- | --- | --- |
| 01 - Deterministic spine | `islandflow-259.1` | None | Create the seeded generation foundation and canonical event output contract. |
| 02 - Manifests, fixtures, CLI | `islandflow-259.2` | `islandflow-zxh.1` | Turn deterministic generation into durable fixtures and manifests. |
| 03 - Scenarios, labels, expected outputs | `islandflow-259.3` | `islandflow-zxh.2` | Author named scenarios, separate labels, and expected derived outputs. |
| 04 - Replay integration | `islandflow-259.4` | `islandflow-zxh.3` | Make replay consume synthetic runs with stable ordering and output comparison. |
| 05 - Demo and load profiles | `islandflow-259.5` | `islandflow-zxh.4` | Expose named deterministic demo/load profiles after replay validation. |
| 99 - Future historical calibration | `islandflow-259.6` | `islandflow-259.5` | Calibrate parameters from historical data later, after the MVP is stable. |
## PR Split Notes
Most phases are intended to fit in one focused PR. Phase 03 is already split into PR-sized Beads children because scenario authoring and expected-output comparison can grow quickly:
- `islandflow-259.3.1` - Split synthetic phase 03a: scenario catalog and labels
- `islandflow-259.3.2` - Split synthetic phase 03b: expected-output manifests
If any other phase starts touching unrelated service, API, UI, and storage behavior in one PR, split it before implementation continues.
## Matching Beads Epic
- `islandflow-259` - Plan synthetic market-data implementation phases