document research basis for phase plans

2026-06-16 13:53:54 -04:00
commit 412c8b8af9
parent eaa22de302
19 changed files with 332 additions and 4 deletions
--- a/docs/implementation/README.md
+++ b/docs/implementation/README.md
@@ -2,7 +2,18 @@

 This directory is the active planning layer for the synthetic market-data and smart-money/smart-flow architecture work.

-The architecture reviews in `docs/plans/` are background guidance. Future implementation work should use the current phase document and matching Beads issue as the active scope. If a phase document and an older architecture review disagree, pause and update the phase document or Beads issue before writing code.
+The architecture reviews in `docs/plans/` and research reports in `docs/research-docs/` are background guidance. Future implementation work should use the current phase document and matching Beads issue as the active scope. If a phase document and an older architecture review or research report disagree, pause and update the phase document or Beads issue before writing code.
+
+## Document Precedence
+
+Use this precedence order when planning or implementing phase work:
+
+1. Beads issue
+2. Phase document in `docs/implementation/`
+3. Architecture plan in `docs/plans/`
+4. Research report in `docs/research-docs/`
+
+Research reports provide rationale and useful constraints. They do not add active implementation scope unless that scope is explicitly pulled into a phase document and Beads issue.

 ## Source Plans

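The precedence order added to `docs/implementation/README.md` above can be sketched as a small lookup. This is a minimal illustration, not part of the commit: `SourceKind`, `resolve`, `needs_reconciliation`, and `adds_active_scope` are hypothetical names, and the statements passed in are assumed to be plain strings summarizing what each document says about one scope question.

```python
from enum import IntEnum
from typing import Optional


class SourceKind(IntEnum):
    """Lower value means higher precedence, mirroring the README list."""
    BEADS_ISSUE = 1
    PHASE_DOCUMENT = 2
    ARCHITECTURE_PLAN = 3
    RESEARCH_REPORT = 4


# Only these sources carry active implementation scope; architecture plans
# and research reports are background guidance and rationale.
ACTIVE_SCOPE_SOURCES = frozenset({SourceKind.BEADS_ISSUE, SourceKind.PHASE_DOCUMENT})


def resolve(statements: dict[SourceKind, str]) -> Optional[tuple[SourceKind, str]]:
    """Return the statement from the highest-precedence source present."""
    if not statements:
        return None
    winner = min(statements)
    return winner, statements[winner]


def needs_reconciliation(statements: dict[SourceKind, str]) -> bool:
    """True when sources disagree, i.e. pause and update the phase document
    or Beads issue before writing code."""
    return len(set(statements.values())) > 1


def adds_active_scope(kind: SourceKind) -> bool:
    """Background documents never add scope on their own."""
    return kind in ACTIVE_SCOPE_SOURCES
```

For example, if a research report proposes full IV-surface modeling while the phase document scopes only evidence facts, `resolve` returns the phase document's statement and `needs_reconciliation` flags the disagreement.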
--- a/docs/implementation/smart-money/00-roadmap.md
+++ b/docs/implementation/smart-money/00-roadmap.md
@@ -2,6 +2,14 @@

 This roadmap breaks `docs/plans/smart-flow-architecture-review.md` into implementation-sized phases. The recommended direction is Option B: keep the working stack, but rebuild the domain pipeline around observations, evidence clusters, cautious hypotheses, confidence, alternatives, abstention, replay evaluation, and user-facing insight projections.

+## Source Documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+- Research architecture review copy: [`docs/research-docs/smart-flow-architecture-review.md`](../../research-docs/smart-flow-architecture-review.md)
+
+The research documents are background and rationale only. Scope comes from the Beads issue and the phase document.
+
 ## Core Constraints

 - Do not treat "smart money" as a canonical fact emitted by the system.
--- a/docs/implementation/smart-money/01-contracts-vocabulary.md
+++ b/docs/implementation/smart-money/01-contracts-vocabulary.md
@@ -8,6 +8,23 @@ Introduce the domain vocabulary and contracts that distinguish observations, evi

 The current system has useful infrastructure but overconfident domain names. Before changing classifier behavior, the codebase needs the language to express what is observed, what is inferred, what is uncertain, and when the system should abstain.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only vocabulary, contracts, versioning, and compatibility notes.
+
+## Research basis
+
+- The research direction is direct observation to inference to hypothesis, with preserved evidence and visible uncertainty.
+- "Smart money" should not be modeled as a canonical fact; user-facing insight should be a projection from evidence-backed hypotheses.
+- Confidence, conviction, alternatives, and abstention need separate language before behavior changes.
+
+## Deferred research ideas
+
+- Participant identity claims and research-grade calibration stay outside the vocabulary foundation.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine, so contract work can align with canonical raw event and provenance assumptions.
--- a/docs/implementation/smart-money/02-evidence-clustering-features.md
+++ b/docs/implementation/smart-money/02-evidence-clustering-features.md
@@ -8,6 +8,23 @@ Make evidence extraction, eligibility, quote/context joins, clustering, and feat

 Contracts alone do not change behavior. This phase gives the system a clean evidence layer so later scoring can reason from auditable facts instead of a generic feature bag or overconfident classifier labels.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only eligibility, evidence facts, clustering, and traceable features.
+
+## Research basis
+
+- Trade signing, quote context, sale conditions, stale quotes, wide markets, and event context all affect whether a print is usable evidence.
+- Evidence should preserve raw refs, eligibility decisions, quality signals, and negative context before any hypothesis is scored.
+- Ingest should normalize observations; signal policy belongs in explicit evidence/scoring stages.
+
+## Deferred research ideas
+
+- Full IV-surface modeling, broad news/FDA event feeds, and deep historical baselines can be added later when scoped.
+
 ## Dependencies on earlier phases

 - `islandflow-zxh.1` - Smart-flow contracts and vocabulary
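The research basis for the evidence phase says stale quotes, wide markets, and sale conditions affect whether a print is usable, and that eligibility decisions should be preserved alongside raw refs. A minimal sketch of that idea follows. Every name and threshold here is an assumption for illustration (`PrintContext`, `assess`, the 1000 ms quote age, the 10% relative spread, and the sale-condition labels); none comes from the repository.

```python
from dataclasses import dataclass

# Placeholder policy values, chosen only to make the sketch runnable.
MAX_QUOTE_AGE_MS = 1000
MAX_RELATIVE_SPREAD = 0.10
EXCLUDED_SALE_CONDITIONS = frozenset({"late", "cancelled"})  # hypothetical labels


@dataclass(frozen=True)
class PrintContext:
    print_ref: str        # reference back to the raw print
    quote_age_ms: int     # age of the joined quote at print time
    bid: float
    ask: float
    sale_condition: str


@dataclass(frozen=True)
class EligibilityDecision:
    print_ref: str
    eligible: bool
    reasons: tuple[str, ...]  # negative context is kept, not discarded


def assess(ctx: PrintContext) -> EligibilityDecision:
    """Record why a print is or is not usable evidence."""
    reasons: list[str] = []
    if ctx.quote_age_ms > MAX_QUOTE_AGE_MS:
        reasons.append("stale_quote")
    mid = (ctx.bid + ctx.ask) / 2
    if mid <= 0 or ctx.ask < ctx.bid:
        reasons.append("invalid_quote")
    elif (ctx.ask - ctx.bid) / mid > MAX_RELATIVE_SPREAD:
        reasons.append("wide_market")
    if ctx.sale_condition in EXCLUDED_SALE_CONDITIONS:
        reasons.append("excluded_sale_condition")
    return EligibilityDecision(ctx.print_ref, not reasons, tuple(reasons))
```

Returning the reasons with the decision, rather than silently dropping ineligible prints, keeps the negative context available to later scoring stages.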
--- a/docs/implementation/smart-money/03-hypothesis-scoring-abstention.md
+++ b/docs/implementation/smart-money/03-hypothesis-scoring-abstention.md
@@ -8,6 +8,23 @@ Convert evidence clusters into cautious flow hypotheses with explicit score vect

 Scoring should wait until the system can represent evidence clearly and synthetic scenarios can describe expected positive, negative, and abstention cases. This phase is where the product stops acting like every signal is a confident "smart money" claim.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only cautious hypothesis scoring, alternatives, penalties, and abstention.
+
+## Research basis
+
+- Premium concentration, sweep-like activity, IV movement, and equity confirmation support hypotheses only when evidence quality and context agree.
+- False positives from deep-ITM stock replacement, spreads/hedges, stale quotes, and event-driven flow need explicit penalties or abstention.
+- Confidence should reflect policy confidence in the evidence, not a claim of hidden participant identity.
+
+## Deferred research ideas
+
+- Empirical threshold tuning, historical calibration, and ML-based scoring stay future work until replay/golden validation exists.
+
 ## Dependencies on earlier phases

 - `islandflow-zxh.1` - Smart-flow contracts and vocabulary
--- a/docs/implementation/smart-money/04-replay-evaluation-golden-tests.md
+++ b/docs/implementation/smart-money/04-replay-evaluation-golden-tests.md
@@ -8,6 +8,24 @@ Make deterministic replay and golden output comparison the acceptance gate for s

 Replay evaluation should come after synthetic replay can select stable runs and after hypothesis scoring has outputs worth validating. This phase turns architecture discipline into a repeatable test path.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+- Synthetic research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This phase implements only deterministic replay evaluation and compact golden tests.
+
+## Research basis
+
+- Replay is the acceptance gate for derived smart-flow outputs because evidence and hypotheses must be reproducible.
+- Validation must include positive cases, false positives, noisy contexts, and abstentions.
+- Tests should avoid lookahead bias and compare stable signatures instead of brittle full-payload dumps.
+
+## Deferred research ideas
+
+- Historical backtesting windows, empirical calibration datasets, and broad benchmark reports belong in later calibration work.
+
 ## Dependencies on earlier phases

 - `islandflow-zxh.1` - Smart-flow contracts and vocabulary
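The replay-evaluation research basis recommends comparing stable signatures instead of brittle full-payload dumps. One way to read that is sketched below: project each output onto a few fields that should not change between runs, canonicalize, and hash. The field names in `STABLE_FIELDS` and the helper names are hypothetical; the actual contract fields are defined by the phase documents, not here.

```python
import hashlib
import json
from typing import Any, Mapping, Sequence

# Hypothetical field names; volatile fields such as timestamps are left out.
STABLE_FIELDS = ("hypothesis_kind", "decision", "confidence_band", "evidence_refs")


def stable_signature(output: Mapping[str, Any],
                     fields: Sequence[str] = STABLE_FIELDS) -> str:
    """Hash only the fields that must stay stable across replays."""
    projected = {}
    for name in fields:
        value = output.get(name)
        if isinstance(value, (list, tuple, set)):
            value = sorted(value)  # order-insensitive collections
        projected[name] = value
    canonical = json.dumps(projected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def mismatches(golden: Mapping[str, str],
               actual: Mapping[str, Mapping[str, Any]]) -> list[str]:
    """Return the case names whose signature differs from the golden one."""
    return sorted(
        case for case, expected in golden.items()
        if case not in actual or stable_signature(actual[case]) != expected
    )
```

Because the signature ignores fields outside `STABLE_FIELDS`, a golden test built this way fails on a changed decision or confidence band but not on a changed emission timestamp.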
--- a/docs/implementation/smart-money/05-api-ui-explainability.md
+++ b/docs/implementation/smart-money/05-api-ui-explainability.md
@@ -8,6 +8,23 @@ Expose evidence-backed smart-flow outputs through API, websocket, and UI surface

 The presentation layer should wait until contracts, evidence, scoring, and replay evaluation are stable. Otherwise the UI will harden old overconfident language or teach users to trust unvalidated outputs.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only API, websocket, and UI explainability surfaces for validated outputs.
+
+## Research basis
+
+- Users need to see evidence quality, confidence versus conviction, alternatives, and abstention instead of a single certainty label.
+- The research supports cautious smart-flow insight projections, not canonical "smart money" facts.
+- Why-not and penalty context are part of the product surface because false positives are central to the domain.
+
+## Deferred research ideas
+
+- Advanced explanatory analytics, learned confidence calibration, and broad catalyst intelligence should wait for future scoped work.
+
 ## Dependencies on earlier phases

 - `islandflow-zxh.1` - Smart-flow contracts and vocabulary
--- a/docs/implementation/smart-money/99-future-calibration.md
+++ b/docs/implementation/smart-money/99-future-calibration.md
@@ -8,6 +8,23 @@ Plan future calibration of smart-flow confidence, policy thresholds, penalties,

 The architecture should leave room for calibration, but calibration should not block the MVP. The system first needs clean facts, evidence, hypotheses, and replayable evaluation before tuning can be meaningful.

+## Source documents
+
+- Architecture plan: [`docs/plans/smart-flow-architecture-review.md`](../../plans/smart-flow-architecture-review.md)
+- Research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This future phase is the place to turn research ideas into scoped calibration work after MVP.
+
+## Research basis
+
+- Historical validation should be time-of-day aware and avoid lookahead bias.
+- Baselines for "unusual" should account for ticker, tenor bucket, regime, and event-day exclusions.
+- Confidence, penalties, abstention, and alternatives need versioned policy outputs so calibration stays auditable.
+
+## Deferred research ideas
+
+- ML scoring, learned calibration, richer catalyst feeds, and large historical benchmark suites require separate future Beads scope.
+
 ## Dependencies on earlier phases

 - `islandflow-zxh.5` - Smart-flow API/UI explainability
--- a/docs/implementation/synthetic-market-data/00-roadmap.md
+++ b/docs/implementation/synthetic-market-data/00-roadmap.md
@@ -2,6 +2,14 @@

 This roadmap breaks `docs/plans/synthetic-market-data-architecture-review.md` into implementation-sized phases. The recommended direction is still Option B: extract deterministic synthetic generation into a first-class reusable engine while keeping the useful NATS, ClickHouse, compute, API, replay, and web stack.

+## Source Documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+- Research architecture review copy: [`docs/research-docs/synthetic-data-architecture-review.md`](../../research-docs/synthetic-data-architecture-review.md)
+
+The research documents are background and rationale only. Scope comes from the Beads issue and the phase document.
+
 ## Core Constraints

 - Emit canonical market event types: `OptionPrint`, `OptionNBBO`, `EquityPrint`, and `EquityQuote`.
--- a/docs/implementation/synthetic-market-data/01-deterministic-spine.md
+++ b/docs/implementation/synthetic-market-data/01-deterministic-spine.md
@@ -8,6 +8,23 @@ Create the reusable deterministic foundation for synthetic market data. This pha

 Everything else depends on reproducible raw events. Manifests, labels, replay, demos, and smart-flow tests are only trustworthy if the same seed/profile bundle produces the same canonical market event stream every time.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This phase implements only the deterministic spine described below.
+
+## Research basis
+
+- The research recommends a no-history-first, transparent, deterministic generator rather than historical replay as an MVP prerequisite.
+- The generator needs core market realism handles from the start: discrete ticks, varying spreads, clustered arrivals, heterogeneous sizes, quote/trade separation, and options-chain sparsity.
+- Full agent-based, limit-order-book, and generative-ML simulation are too heavy for the first foundation.
+
+## Deferred research ideas
+
+- Full LOB simulation, agent-based simulation, generative ML, and empirical calibration stay out of this phase.
+
 ## Dependencies on earlier phases

 None. This is the first synthetic phase.
--- a/docs/implementation/synthetic-market-data/02-manifests-fixtures-cli.md
+++ b/docs/implementation/synthetic-market-data/02-manifests-fixtures-cli.md
@@ -8,6 +8,23 @@ Turn the deterministic generator into reusable artifacts: fixture files, run man

 The deterministic spine gives the repo stable raw events. The next step is to make those events durable and addressable so downstream phases can reference exact generated runs instead of recreating ad hoc local randomness.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This phase implements only manifests, fixtures, and CLI support.
+
+## Research basis
+
+- Deterministic replay and reviewable artifacts are necessary for synthetic data to be useful as validation data, not just demo data.
+- Expected-output manifests should pin seed, profile, generator version, event hashes, and replay ordering.
+- Hidden labels must stay separate from market events so tests do not leak ground truth into production-like paths.
+
+## Deferred research ideas
+
+- Empirical residual resampling and historical-window bootstrapping are future artifact sources, not this CLI's first requirement.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine
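The manifests phase says expected-output manifests should pin seed, profile, generator version, event hashes, and replay ordering. The sketch below shows one possible shape for that, under stated assumptions: the event fields, the toy `generate_events` generator, and `build_manifest` are all hypothetical stand-ins, not the project's generator or manifest schema. Integer ticks are used so hashes do not depend on float formatting.

```python
import hashlib
import json
import random
from typing import Any


def generate_events(seed: int, count: int) -> list[dict[str, Any]]:
    """Toy seeded generator: no wall-clock or ambient randomness."""
    rng = random.Random(seed)
    events = []
    ts_ms = 0
    for seq in range(count):
        ts_ms += rng.randint(1, 50)  # strictly increasing event time
        events.append({
            "seq": seq,
            "ts_ms": ts_ms,
            "price_ticks": 10_000 + rng.randint(-5, 5),
            "size": rng.choice([1, 5, 10, 50]),
        })
    return events


def event_hash(event: dict[str, Any]) -> str:
    """Hash a canonical JSON encoding of a single event."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(seed: int, profile: str, generator_version: str,
                   events: list[dict[str, Any]]) -> dict[str, Any]:
    """Pin the inputs and the exact event stream of one run."""
    return {
        "seed": seed,
        "profile": profile,
        "generator_version": generator_version,
        "event_count": len(events),
        # List order doubles as the pinned replay ordering.
        "event_hashes": [event_hash(e) for e in events],
    }
```

Regenerating with the same seed, profile, and generator version should yield an identical manifest, which is what lets downstream phases reference an exact run instead of recreating local randomness.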
--- a/docs/implementation/synthetic-market-data/03-scenarios-labels-expected-outputs.md
+++ b/docs/implementation/synthetic-market-data/03-scenarios-labels-expected-outputs.md
@@ -8,6 +8,24 @@ Author named deterministic scenarios, separate ground-truth labels, and expected

 The generator and manifest layers should exist before scenario authoring. Smart-flow evidence clustering should also define enough vocabulary for expected outputs to describe evidence requirements without leaking labels into emitted market events.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+- Smart-flow research report: [`docs/research-docs/smart-flow-market-mechanics.md`](../../research-docs/smart-flow-market-mechanics.md)
+
+These documents are rationale, not added scope. This phase implements only named scenarios, separate labels, and expected-output contracts.
+
+## Research basis
+
+- Scenario injection into a realistic synthetic background is mandatory for labeled, replayable alert tests.
+- Negative, noisy, stale, wide-market, and event-context cases matter as much as positive "should detect" scenarios.
+- Labels and expected outputs need required evidence, forbidden evidence, confidence bands, and false-positive penalties.
+
+## Deferred research ideas
+
+- Empirical tuning of scenario frequencies, full historical replay-plus-mutation, and learned scenario generation belong after the MVP scenario catalog is stable.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine
--- a/docs/implementation/synthetic-market-data/04-replay-integration.md
+++ b/docs/implementation/synthetic-market-data/04-replay-integration.md
@@ -8,6 +8,23 @@ Make replay consume synthetic runs deterministically, either directly from gener

 Replay should not be wired to synthetic data until the generator, manifests, labels, and smart-flow hypothesis pipeline have stable semantics. At this point, replay can become a serious acceptance gate instead of a demo convenience.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This phase implements only deterministic synthetic replay integration.
+
+## Research basis
+
+- Replay must preserve event-time ordering and deterministic run identity to prove derived behavior.
+- Synthetic runs should be selectable by source and run metadata rather than ambient randomness.
+- Optional ClickHouse/NATS materialization can exist later, but fast validation should remain infra-free.
+
+## Deferred research ideas
+
+- Historical replay-plus-mutation and calibrated replay benchmarks are future layers after synthetic replay semantics are stable.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine
--- a/docs/implementation/synthetic-market-data/05-demo-load-profiles.md
+++ b/docs/implementation/synthetic-market-data/05-demo-load-profiles.md
@@ -8,6 +8,23 @@ Expose deterministic synthetic runs as named demo and load profiles after the ge

 Demos are useful only after the underlying data can be trusted. This phase deliberately waits until replay and golden evaluation prove the event semantics, so hosted controls do not become a front door to ambient randomness.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This phase implements only named deterministic demo and load profiles.
+
+## Research basis
+
+- Demo streams should use named, seeded profiles so product behavior is reproducible.
+- Load profiles should scale rate or volume without changing event semantics.
+- Realism should come from the generator and scenarios, not hidden UI knobs or wall-clock randomness.
+
+## Deferred research ideas
+
+- Historically bootstrapped demo streams, learned realism upgrades, and full LOB-style demos stay future work.
+
 ## Dependencies on earlier phases

 - `islandflow-259.1` - Synthetic deterministic spine
--- a/docs/implementation/synthetic-market-data/99-future-historical-calibration.md
+++ b/docs/implementation/synthetic-market-data/99-future-historical-calibration.md
@@ -8,6 +8,23 @@ Plan future calibration of synthetic generator parameters from historical market

 It is useful to name the future work now so early designs keep calibration hooks in mind. It should not come before deterministic generation, manifests, scenarios, replay, or demo profiles.

+## Source documents
+
+- Architecture plan: [`docs/plans/synthetic-market-data-architecture-review.md`](../../plans/synthetic-market-data-architecture-review.md)
+- Research report: [`docs/research-docs/synthetic-market-data-generation.md`](../../research-docs/synthetic-market-data-generation.md)
+
+These documents are rationale, not added scope. This future phase is the place to turn research ideas into scoped calibration work after MVP.
+
+## Research basis
+
+- Once historical data exists, calibration should fit arrival curves, spread states, size mixtures, venue shares, and options-chain activity weights.
+- Replay-plus-mutation can improve realism while preserving deterministic test intent.
+- Calibration should layer onto the deterministic engine rather than replace it wholesale.
+
+## Deferred research ideas
+
+- Generative ML, learned LOB simulators, and agent-based models remain later research tracks unless a future Beads issue scopes them explicitly.
+
 ## Dependencies on earlier phases

 - `islandflow-259.5` - Synthetic demo and load profiles