# Synthetic Market-Data Phase 04: Replay Integration
## Purpose

Make replay consume synthetic runs deterministically, either directly from generated fixtures or from materialized storage rows, while preserving the same ordering semantics the real replay path uses.
## Why this phase comes now

Replay should not be wired to synthetic data until the generator, manifests, labels, and smart-flow hypothesis pipeline have stable semantics. At this point, replay can become a serious acceptance gate instead of a demo convenience.
## Source documents

- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
- Research architecture review copy: [`docs/research-docs/synthetic-data-architecture-review.md`](../../research-docs/synthetic-data-architecture-review.md)

These documents are rationale, not added scope. This phase implements only deterministic synthetic replay integration.
## Research basis

- Replay must preserve event-time ordering and deterministic run identity to prove derived behavior.
- Synthetic runs should be selectable by source and run metadata rather than ambient randomness.
- Optional ClickHouse/NATS materialization can exist later, but fast validation should remain infra-free.
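One way to make "selectable by source and run metadata" concrete is a selector type that cannot describe a synthetic replay without naming its run. The following is a minimal sketch under stated assumptions: `ReplaySelector`, `RawReplayParams`, and `parseReplaySelector` are hypothetical names, and treating an omitted source as live replay is an assumption, not something the source documents specify.

```typescript
// Hypothetical selector shape; none of these names come from the repository.
type ReplaySelector =
  | { source: "live" }
  | { source: "synthetic"; runId: string };

// Raw request parameters as they might arrive on a replay route (assumed names).
interface RawReplayParams {
  source?: string;
  run_id?: string;
}

function parseReplaySelector(params: RawReplayParams): ReplaySelector {
  // Assumption: an omitted source means the existing live replay path.
  const source = params.source ?? "live";

  if (source === "live") {
    if (params.run_id !== undefined) {
      throw new Error("run_id is only valid when source is 'synthetic'");
    }
    return { source: "live" };
  }

  if (source === "synthetic") {
    // No default and no "latest run" fallback: the caller must name the run.
    if (params.run_id === undefined || params.run_id.trim() === "") {
      throw new Error("synthetic replay requires an explicit run_id");
    }
    return { source: "synthetic", runId: params.run_id };
  }

  throw new Error(`unknown replay source: ${source}`);
}
```

Because the synthetic variant carries a mandatory `runId`, downstream code that receives a `ReplaySelector` never has to guess which run is meant.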
## Deferred research ideas

- Historical replay-plus-mutation and calibrated replay benchmarks are future layers after synthetic replay semantics are stable.
## Dependencies on earlier phases

- `islandflow-259.1` - Synthetic deterministic spine
- `islandflow-259.2` - Manifests, fixtures, and CLI
- `islandflow-259.3` - Scenarios, labels, and expected outputs
- `islandflow-zxh.3` - Hypothesis scoring and abstention
## Likely files/modules touched

- `services/replay/src/`
- API replay routes in `services/api/`
- Replay-related shared types in `packages/types/`
- Optional fixture materialization helpers in `packages/storage/`
- Replay tests or golden comparison helpers
## In-scope work

- Add replay source/run selectors for synthetic runs.
- Support fixture-backed replay without infrastructure where practical.
- Preserve ordering by event time, ingest time, sequence, and stable event ID.
- Compare replayed derived outputs against manifest signatures or expected-output sections.
- Keep optional ClickHouse/NATS materialized replay tests behind non-default gates.
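The four-key ordering above can be expressed as a single comparator shared by the fixture-backed and storage-backed paths. This is an illustrative sketch, not the existing replay implementation: the `ReplayEvent` field names are hypothetical, and timestamps are assumed to be millisecond numbers that fit safely in a JavaScript `number`.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
interface ReplayEvent {
  eventTimeMs: number;
  ingestTimeMs: number;
  seq: number;
  eventId: string;
}

// Total order: event time, then ingest time, then sequence, then event ID.
function compareReplayEvents(a: ReplayEvent, b: ReplayEvent): number {
  return (
    a.eventTimeMs - b.eventTimeMs ||
    a.ingestTimeMs - b.ingestTimeMs ||
    a.seq - b.seq ||
    // Plain code-unit comparison avoids locale-dependent results.
    (a.eventId < b.eventId ? -1 : a.eventId > b.eventId ? 1 : 0)
  );
}

// Returns a sorted copy so the caller's fixture array is never mutated.
function orderForReplay<T extends ReplayEvent>(events: readonly T[]): T[] {
  return [...events].sort(compareReplayEvents);
}
```

As long as event IDs are unique within a run, the final tie-breaker makes the order total, so the result does not depend on the input order or on sort stability.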
## Explicitly out-of-scope work

- Building new scenario labels.
- Reworking smart-flow scoring policy.
- Demo profile controls.
- Load testing.
- Historical calibration.
## Acceptance criteria

- Replay can select a synthetic source and `run_id`.
- Fixture-backed replay respects manifest ordering.
- Derived output signatures can be compared with expected manifests.
- Fast replay tests remain infra-free by default.
- Optional infra-backed tests are clearly named and gated.
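The signature comparison criterion might look like the sketch below. Everything here is an assumption made for illustration: the `DerivedOutput` shape, the choice of which fields count as stable, and the hash. The hash is a small FNV-1a-style function over UTF-16 code units, used only to keep the example dependency-free; the manifest format from the earlier phases may well specify a different digest.

```typescript
// Hypothetical derived-output shape; not taken from the repository.
interface DerivedOutput {
  kind: string;
  eventId: string;
  score: number;
  emittedAtMs: number; // wall-clock metadata, deliberately excluded from the signature
}

// Stand-in hash (FNV-1a style, 32-bit) so the sketch needs no imports.
function fnv1a32Hex(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Order-sensitive signature over stable fields only.
function signatureOf(outputs: readonly DerivedOutput[]): string {
  const canonical = outputs
    .map((o) => `${o.kind}|${o.eventId}|${o.score.toFixed(6)}`)
    .join("\n");
  return fnv1a32Hex(canonical);
}

interface SignatureCheck {
  match: boolean;
  expected: string;
  actual: string;
}

function compareWithManifest(
  outputs: readonly DerivedOutput[],
  expectedSignature: string,
): SignatureCheck {
  const actual = signatureOf(outputs);
  return { match: actual === expectedSignature, expected: expectedSignature, actual };
}
```

Keeping the signature order-sensitive means a replay that emits the right outputs in the wrong order still fails the comparison, which is the behavior an ordering-focused acceptance gate would want.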
## Test strategy

Start with fixture-backed replay ordering tests and manifest-signature comparisons. Add optional service-container or ClickHouse materialization tests only after the fast path is stable, and do not make those tests part of the default `bun test` requirement.
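A simple way to keep infra-backed tests out of the default run is an explicit opt-in flag. The sketch below assumes a hypothetical environment variable named `REPLAY_INFRA_TESTS`; the variable name, the `ReplayTestCase` shape, and the helper names are illustrative and not taken from the repository. The environment is passed in as a plain record so the logic can be checked without any runtime-specific globals.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical gate name; the real project may choose a different one.
const INFRA_GATE = "REPLAY_INFRA_TESTS";

// Opt-in only: anything other than the exact value "1" leaves infra tests off.
function infraTestsEnabled(env: Env): boolean {
  return env[INFRA_GATE] === "1";
}

interface ReplayTestCase {
  name: string;
  requiresInfra: boolean;
}

// Fast tests always run; infra tests run only when the gate is set.
function selectTests(cases: readonly ReplayTestCase[], env: Env): ReplayTestCase[] {
  const allowInfra = infraTestsEnabled(env);
  return cases.filter((c) => allowInfra || !c.requiresInfra);
}
```

With Bun's test runner, the same predicate could feed `test.skipIf(!infraTestsEnabled(process.env))` on the gated cases, so a plain `bun test` skips them and reports them as skipped rather than silently omitting them.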
## Risks / design traps

- Creating a synthetic-only replay path with different ordering will hide bugs, because passing synthetic tests would no longer say anything about the real path.
- Letting optional infra tests become default will slow or destabilize CI.
- Comparing full raw payloads everywhere may make tests brittle; prefer stable signatures where they are sufficient.
- Replay selectors that are not run-scoped can mix synthetic and live data.
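The last trap can be guarded against with a run-scoped filter plus a defensive check before replay starts. This is a sketch only: `StoredEventRow` and both helpers are hypothetical, and representing live rows with a `null` run ID is an assumption about how materialized rows might be tagged.

```typescript
// Hypothetical row shape for materialized events.
interface StoredEventRow {
  source: "live" | "synthetic";
  runId: string | null; // assumed to be null for live rows in this sketch
  eventId: string;
}

// Select only rows that belong to one named synthetic run.
function rowsForSyntheticRun(
  rows: readonly StoredEventRow[],
  runId: string,
): StoredEventRow[] {
  if (runId === "") {
    throw new Error("runId must be non-empty");
  }
  return rows.filter((r) => r.source === "synthetic" && r.runId === runId);
}

// Defensive check: fail loudly if a batch contains anything outside the run.
function assertRunScoped(rows: readonly StoredEventRow[], runId: string): void {
  for (const row of rows) {
    if (row.source !== "synthetic" || row.runId !== runId) {
      throw new Error(
        `row ${row.eventId} is outside synthetic run ${runId} (source=${row.source}, runId=${row.runId})`,
      );
    }
  }
}
```

Filtering on both `source` and `runId` means a live row can never enter a synthetic replay, even if it happened to carry a colliding run ID.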
## Suggested future Codex implementation prompt

```text
Implement docs/implementation/synthetic-market-data/04-replay-integration.md for Beads issue islandflow-259.4. Add synthetic source/run replay support with stable ordering and manifest comparison. Do not add demo controls, load profiles, or historical calibration, and keep the fast test path infra-free.
```
## Matching Beads issue title/id

- `islandflow-259.4` - Synthetic market-data phase 04: replay integration