islandflow/docs/implementation/synthetic-market-data/04-replay-integration.md
dirtydishes b1b18d4f17
All checks were successful
CI / Validate (pull_request) Successful in 1m15s
tighten implementation planning traceability
2026-06-16 14:18:23 -04:00

87 lines
4 KiB
Markdown

# Synthetic Market-Data Phase 04: Replay Integration
## Purpose
Make replay consume synthetic runs deterministically, either directly from generated fixtures or from materialized storage rows, while preserving the same ordering semantics the real replay path uses.
## Why this phase comes now
Replay should not be wired to synthetic data until the generator, manifests, labels, and smart-flow hypothesis pipeline have stable semantics. At this point, replay can become a serious acceptance gate instead of a demo convenience.
## Source documents
- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
- Research architecture review copy: [`docs/research-docs/synthetic-data-architecture-review.md`](../../research-docs/synthetic-data-architecture-review.md)
These documents are rationale, not added scope. This phase implements only deterministic synthetic replay integration.
## Research basis
- Replay must preserve event-time ordering and deterministic run identity to prove derived behavior.
- Synthetic runs should be selectable by source and run metadata rather than ambient randomness.
- Optional ClickHouse/NATS materialization can exist later, but fast validation should remain infra-free.
## Deferred research ideas
- Historical replay-plus-mutation and calibrated replay benchmarks are future layers after synthetic replay semantics are stable.
## Dependencies on earlier phases
- `islandflow-259.1` - Synthetic deterministic spine
- `islandflow-259.2` - Manifests, fixtures, and CLI
- `islandflow-259.3` - Scenarios, labels, and expected outputs
- `islandflow-zxh.3` - Hypothesis scoring and abstention
## Likely files/modules touched
- `services/replay/src/`
- API replay routes in `services/api/`
- Replay-related shared types in `packages/types/`
- Optional fixture materialization helpers in `packages/storage/`
- Replay tests or golden comparison helpers
## In-scope work
- Add replay source/run selectors for synthetic runs.
- Support fixture-backed replay without infrastructure where practical.
- Preserve ordering by event time, ingest time, sequence, and stable event ID.
- Compare replayed derived outputs against manifest signatures or expected-output sections.
- Keep optional ClickHouse/NATS materialized replay tests behind non-default gates.
## Explicitly out-of-scope work
- Building new scenario labels.
- Reworking smart-flow scoring policy.
- Demo profile controls.
- Load testing.
- Historical calibration.
## Acceptance criteria
- Replay can select a synthetic source and `run_id`.
- Fixture-backed replay respects manifest ordering.
- Derived output signatures can be compared with expected manifests.
- Fast replay tests remain infra-free by default.
- Optional infra-backed tests are clearly named and gated.
## Test strategy
Start with fixture-backed replay ordering tests and manifest-signature comparisons. Add optional service-container or ClickHouse materialization tests only after the fast path is stable, and do not make those tests part of the default `bun test` requirement.
## Risks / design traps
- Creating a synthetic-only replay path with different ordering will hide bugs.
- Letting optional infra tests become default will slow or destabilize CI.
- Comparing full raw payloads everywhere may make tests brittle; use stable signatures where better.
- Replay selectors that are not run-scoped can mix synthetic and live data.
## Suggested future Codex implementation prompt
```text
Implement docs/implementation/synthetic-market-data/04-replay-integration.md for Beads issue islandflow-259.4. Add synthetic source/run replay support with stable ordering and manifest comparison. Do not add demo controls, load profiles, or historical calibration, and keep the fast test path infra-free.
```
## Matching Beads issue title/id
- `islandflow-259.4` - Synthetic market-data phase 04: replay integration