document research basis for phase plans

2026-06-16 13:53:54 -04:00 · 2026-06-16 13:53:54 -04:00 · 412c8b8af9
commit 412c8b8af9
parent eaa22de302
19 changed files with 332 additions and 4 deletions
--- a/docs/implementation/synthetic-market-data/03-scenarios-labels-expected-outputs.md
+++ b/docs/implementation/synthetic-market-data/03-scenarios-labels-expected-outputs.md
@@ -8,6 +8,24 @@ Author named deterministic scenarios, separate ground-truth labels, and expected

 The generator and manifest layers should exist before scenario authoring. Smart-flow evidence clustering should also define enough vocabulary for expected outputs to describe evidence requirements without leaking labels into emitted market events.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+- Smart-flow research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only named scenarios, separate labels, and expected-output contracts.
+
+## Research basis
+
+- Scenario injection into a realistic synthetic background is mandatory for labeled, replayable alert tests.
+- Negative, noisy, stale, wide-market, and event-context cases matter as much as positive "should detect" scenarios.
+- Labels and expected outputs need required evidence, forbidden evidence, confidence bands, and false-positive penalties.
+
+## Deferred research ideas
+
+- Empirical tuning of scenario frequencies, full historical replay-plus-mutation, and learned scenario generation belong after the MVP scenario catalog is stable.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine