From 4506ed7ffa47418f493e32cfc7c80cfee27923ec Mon Sep 17 00:00:00 2001
From: dirtydishes
Date: Wed, 17 Jun 2026 13:36:23 -0400
Subject: [PATCH] add readable implementation roadmap docs

---
 docs/implementation/index.html                | 543 ++++++++++++
 .../smart-money/00-roadmap.html               | 502 +++++++++++
 docs/implementation/smart-money/index.html    |   3 +
 .../synthetic-market-data/00-roadmap.html     | 481 +++++++++++
 .../synthetic-market-data/index.html          |   3 +
 .../plans/smart-flow-architecture-review.html | 814 ++++++++++++++++++
 ...hetic-market-data-architecture-review.html |   2 +
 7 files changed, 2348 insertions(+)
 create mode 100644 docs/implementation/index.html
 create mode 100644 docs/implementation/smart-money/00-roadmap.html
 create mode 100644 docs/implementation/synthetic-market-data/00-roadmap.html
 create mode 100644 docs/plans/smart-flow-architecture-review.html

diff --git a/docs/implementation/index.html b/docs/implementation/index.html
new file mode 100644
index 0000000..bfb9380
--- /dev/null
+++ b/docs/implementation/index.html
@@ -0,0 +1,543 @@
Implementation Phase Plans
+
+
+

Implementation Map

+

Implementation Phase Plans

+

The active planning layer for synthetic market-data and smart-money/smart-flow architecture work. Architecture reviews and research reports are background; phase documents and Beads issues define execution scope.

+
+ Active scope guide + Synthetic market data + Smart flow +
+
+ +
+ + + +
+

Document Precedence

+
+
+
  1. Current Beads issue
  2. Referenced phase document under docs/implementation/
  3. Architecture plan under docs/plans/
  4. Research report under docs/research-docs/
+

This repository uses docs/research-docs/ for research reports; docs/research/ is not present. Research reports provide rationale and useful constraints, but they do not add active implementation scope unless that scope is explicitly pulled into a phase document and Beads issue.

+
+
+
+ +
+

Source Plans

+
+ + +
+
+ +
+

Planning Rules

+
+
+
  • Prefer small, reviewable PRs.
  • Do not implement an entire architecture plan at once.
  • Use Beads issues for execution tracking and dependency management.
  • Keep durable architecture and phase detail in these docs, not in long Beads descriptions.
  • Synthetic data must emit canonical market event types, not synthetic-only pipeline event types.
  • Synthetic labels must remain separate from emitted market events.
  • Smart-flow logic must distinguish facts, evidence, hypotheses, confidence, and abstention.
  • Historical calibration is future work, not an MVP dependency.
  • Early synthetic tests must not require Docker, ClickHouse, NATS, or Redis.
  • Synthetic foundations should come before demos, UI controls, or live service work.
+
+
+
+ +
+

Beads Map

+
Stream | Epic | Roadmap
Synthetic market data | islandflow-259 - Plan synthetic market-data implementation phases | docs/implementation/synthetic-market-data/00-roadmap.html
Smart money / smart flow | islandflow-zxh - Plan smart-money to smart-flow implementation phases | docs/implementation/smart-money/00-roadmap.html
+
+
+ +
+

Dependency Order

+
Order | Phase | Beads issue | Blocks next because
1A | Synthetic deterministic spine | islandflow-259.1 | Establishes seeded raw event generation and provenance assumptions for later synthetic work.
1B | Smart-flow contracts and vocabulary | islandflow-zxh.1 | Can safely run in parallel with synthetic phase 01; defines evidence/hypothesis language before scoring work.
2 | Synthetic manifests, fixtures, and CLI | islandflow-259.2 | Evidence clustering needs deterministic fixtures before broad behavior changes.
3 | Smart-flow evidence clustering and features | islandflow-zxh.2 | Scenario labels need the evidence vocabulary they are expected to exercise.
4 | Synthetic scenarios, labels, and expected outputs | islandflow-259.3 | Hypothesis scoring needs labeled positive, negative, and abstention cases.
5 | Smart-flow hypothesis scoring and abstention | islandflow-zxh.3 | Synthetic replay integration should validate the derived hypothesis pipeline.
6 | Synthetic replay integration | islandflow-259.4 | Smart-flow golden tests need replayable synthetic runs.
7 | Smart-flow replay evaluation and golden tests | islandflow-zxh.4 | Demos should wait until replay proves the semantics.
8 | Synthetic demo and load profiles | islandflow-259.5 | API/UI explainability should show stable, named, deterministic runs.
9 | Smart-flow API/UI explainability | islandflow-zxh.5 | Final MVP presentation layer after the evidence pipeline is validated.
+
+
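The dependency order above can be checked mechanically. The sketch below is a hypothetical helper, not part of the patch: it encodes only the cross-stream "Depends on" edges listed in the two roadmap tables (within-stream ordering is implied by the phase numbers and is left out) and runs Kahn's topological sort over them. The function name `executionOrder` and the `dependsOn` map are assumptions made for illustration.

```typescript
// Issue IDs and edges are copied from the "Depends on" columns of the two
// roadmap tables in this patch; the helper itself is hypothetical.
type IssueId = string;

const dependsOn: Record<IssueId, IssueId[]> = {
  "islandflow-259.1": [],
  "islandflow-zxh.1": [],
  "islandflow-259.2": ["islandflow-zxh.1"],
  "islandflow-zxh.2": ["islandflow-259.2"],
  "islandflow-259.3": ["islandflow-zxh.2"],
  "islandflow-zxh.3": ["islandflow-259.3"],
  "islandflow-259.4": ["islandflow-zxh.3"],
  "islandflow-zxh.4": ["islandflow-259.4"],
  "islandflow-259.5": ["islandflow-zxh.4"],
  "islandflow-zxh.5": ["islandflow-259.5"],
};

// Kahn's algorithm: repeatedly emit issues whose prerequisites are all done.
function executionOrder(graph: Record<IssueId, IssueId[]>): IssueId[] {
  const issues = Object.keys(graph);
  const unmet: Record<IssueId, number> = {};
  const dependents: Record<IssueId, IssueId[]> = {};
  for (const issue of issues) {
    unmet[issue] = graph[issue].length;
    for (const dep of graph[issue]) {
      (dependents[dep] = dependents[dep] || []).push(issue);
    }
  }
  // Issues with no prerequisites can start immediately (and in parallel).
  const ready = issues.filter((issue) => unmet[issue] === 0);
  const order: IssueId[] = [];
  while (ready.length > 0) {
    const next = ready.shift() as IssueId;
    order.push(next);
    for (const child of dependents[next] || []) {
      unmet[child] -= 1;
      if (unmet[child] === 0) ready.push(child);
    }
  }
  if (order.length !== issues.length) {
    throw new Error("cycle or unknown prerequisite in dependency graph");
  }
  return order;
}
```

Calling `executionOrder(dependsOn)` yields an order that alternates between the two streams; a cycle introduced by a later edit to either roadmap would surface as a thrown error rather than a silently wrong plan.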
+ +
+

Future Work

+
+
+
+

Synthetic historical calibration

+

islandflow-259.6 depends on synthetic phase 05, but is not required for MVP.

+
+
+
+
+

Smart-flow calibration

+

islandflow-zxh.6 depends on smart-flow phase 05 and synthetic future calibration, but is not required for MVP.

+
+
+
+
+ + + +
HTML companion for docs/implementation/README.md. Source Markdown remains the active editable document.
+
diff --git a/docs/implementation/smart-money/00-roadmap.html b/docs/implementation/smart-money/00-roadmap.html
new file mode 100644
index 0000000..a783b79
--- /dev/null
+++ b/docs/implementation/smart-money/00-roadmap.html
@@ -0,0 +1,502 @@
Smart Money / Smart Flow Roadmap
+
+
+

Smart Flow Roadmap

+

Smart Money / Smart Flow Roadmap

+

Implementation-sized phases for turning smart-money detection into smart-flow inference: observations, evidence clusters, cautious hypotheses, confidence, alternatives, abstention, replay evaluation, and user-facing insight projections.

+
+ Beads islandflow-zxh + Recommended: Option B + MVP before calibration +
+
+ +
+ + + + + +
+

Core Constraints

+
+
+
  • Do not treat "smart money" as a canonical fact emitted by the system.
  • Distinguish direct facts, evidence, hypotheses, confidence, alternatives, and abstention.
  • Preserve evidence and uncertainty in storage, API, websocket, and UI surfaces.
  • Keep Redis as hot cache only, not hidden baseline truth.
  • Make replay evaluation the acceptance gate before expanding UI confidence.
  • Keep historical or research-grade calibration as future work, not an MVP dependency.
+
+
+
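One way to keep hypotheses, confidence, alternatives, and abstention distinct is to make abstention a first-class result rather than a low score. The following is a minimal sketch under stated assumptions: the `FlowDecision` shape, the `decide` function, and both thresholds are hypothetical and are not taken from the project's contracts.

```typescript
// Thresholds and field names are illustrative assumptions, not project values.
type FlowDecision =
  | { kind: "hypothesis"; label: string; confidence: number; alternatives: string[] }
  | { kind: "abstain"; reason: string };

const MIN_EVIDENCE_QUALITY = 0.5; // hypothetical cutoff
const MIN_MARGIN = 0.15; // hypothetical gap required between the top two scores

function decide(scores: Record<string, number>, evidenceQuality: number): FlowDecision {
  // Weak evidence leads to abstention regardless of how high any score is.
  if (evidenceQuality < MIN_EVIDENCE_QUALITY) {
    return { kind: "abstain", reason: "evidence quality too low" };
  }
  const ranked = Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
  if (ranked.length === 0) {
    return { kind: "abstain", reason: "no candidate hypotheses" };
  }
  const best = ranked[0];
  const runnerUpScore = ranked.length > 1 ? scores[ranked[1]] : 0;
  // If the leading hypothesis does not clearly beat the next one, abstain.
  if (scores[best] - runnerUpScore < MIN_MARGIN) {
    return { kind: "abstain", reason: "top hypotheses are too close to separate" };
  }
  return {
    kind: "hypothesis",
    label: best,
    confidence: scores[best],
    alternatives: ranked.slice(1), // losing hypotheses are preserved, not dropped
  };
}
```

Because the result is a discriminated union, a consumer has to handle the `abstain` case explicitly before it can read `confidence`, which is one way to stop uncertainty from being silently discarded downstream.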
+ +
+

Phase Sequence

+
Phase | Beads issue | Depends on | Purpose
01 - Contracts and vocabulary | islandflow-zxh.1 | None; safe parallel with islandflow-259.1 | Define evidence/hypothesis/insight contracts and retire canonical overconfidence.
02 - Evidence clustering and features | islandflow-zxh.2 | islandflow-259.2 | Extract eligibility, evidence facts, clusters, and traceable features.
03 - Hypothesis scoring and abstention | islandflow-zxh.3 | islandflow-259.3 | Score cautious hypotheses and represent abstention/alternatives.
04 - Replay evaluation and golden tests | islandflow-zxh.4 | islandflow-259.4 | Validate derived outputs through deterministic replay and golden fixtures.
05 - API/UI explainability | islandflow-zxh.5 | islandflow-259.5 | Expose evidence-backed insights and uncertainty to API, WS, and UI.
99 - Future calibration | islandflow-zxh.6 | islandflow-zxh.5, islandflow-259.6 | Calibrate confidence and policy behavior later with richer datasets.
+
+
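Phase 04 treats deterministic replay against golden fixtures as the validation step. A rough sketch of what such a comparison could look like is shown here; `canonicalJson` and `replayMatchesGolden` are hypothetical names, and the pipeline is passed in as a plain function so the example needs no infrastructure.

```typescript
// Hypothetical golden-fixture comparison; none of these names come from the repo.
interface ReplayResult {
  ok: boolean;
  firstMismatch: number; // index of the first differing output, or -1
}

// Serialize with sorted object keys so key order never causes a false mismatch.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

function replayMatchesGolden<E, O>(
  events: E[],
  pipeline: (events: E[]) => O[],
  golden: O[],
): ReplayResult {
  const actual = pipeline(events);
  const shared = Math.min(actual.length, golden.length);
  for (let i = 0; i < shared; i++) {
    if (canonicalJson(actual[i]) !== canonicalJson(golden[i])) {
      return { ok: false, firstMismatch: i };
    }
  }
  // Extra or missing outputs count as a mismatch at the first unmatched index.
  if (actual.length !== golden.length) {
    return { ok: false, firstMismatch: shared };
  }
  return { ok: true, firstMismatch: -1 };
}
```

Reporting the first mismatching index, rather than a bare pass/fail, is a design choice intended to make a failing golden test point at the specific derived output that changed.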
+ +
+

PR Split Notes

+
+
+ Phase 02a +

islandflow-zxh.2.1 - Eligibility and evidence facts

+

Split out the direct fact and eligibility layer before clustering and feature vector work.

+
+
+ Phase 02b +

islandflow-zxh.2.2 - Clustering and feature vectors

+

Keep clustering and feature vector changes reviewable after the evidence vocabulary exists.

+
+
+ Phase 03a +

islandflow-zxh.3.1 - Hypothesis score vectors

+

Build scoring as a separate semantic layer, not as UI-ready certainty.

+
+
+ Phase 03b +

islandflow-zxh.3.2 - Abstention and insight projection

+

Represent alternatives, penalties, and abstention before exposing user-facing insight projections.

+
+
+ Phase 05a +

islandflow-zxh.5.1 - Evidence API and websocket surfaces

+

Expose evidence-backed contracts through transport before tuning the presentation layer.

+
+
+ Phase 05b +

islandflow-zxh.5.2 - UI explainability surfaces

+

Show evidence quality, confidence vs conviction, alternatives, abstention, and catalyst/noise context.

+
+
+

If an implementation PR crosses contracts, compute, storage, API, and UI in one change, stop and split it.

+
+ +
+

Matching Beads Epic

+
+
+

islandflow-zxh - Plan smart-money to smart-flow implementation phases.

+
+
+
+ +
HTML companion for docs/implementation/smart-money/00-roadmap.md. Source Markdown remains the active editable document.
+
diff --git a/docs/implementation/smart-money/index.html b/docs/implementation/smart-money/index.html
index 0c6ba0e..7e6bfda 100644
--- a/docs/implementation/smart-money/index.html
+++ b/docs/implementation/smart-money/index.html
@@ -411,6 +411,9 @@ tr:last-child td { border-bottom: 0; }
 Beads islandflow-zxh
 7 source docs
 No app code
+ Implementation overview
+ Roadmap HTML
+ Architecture HTML

diff --git a/docs/implementation/synthetic-market-data/00-roadmap.html b/docs/implementation/synthetic-market-data/00-roadmap.html
new file mode 100644
index 0000000..71978eb
--- /dev/null
+++ b/docs/implementation/synthetic-market-data/00-roadmap.html
@@ -0,0 +1,481 @@
Synthetic Market-Data Roadmap
+
+
+

Synthetic Roadmap

+

Synthetic Market-Data Roadmap

+

Implementation-sized phases for extracting deterministic synthetic generation into a first-class reusable engine while keeping the useful NATS, ClickHouse, compute, API, replay, and web stack.

+
+ Beads islandflow-259 + Recommended: Option B + Infra-free early gates +
+
+ +
+ + + + + +
+

Core Constraints

+
+
+
  • Emit canonical market event types: OptionPrint, OptionNBBO, EquityPrint, and EquityQuote.
  • Do not create synthetic-only market event types for the main pipeline.
  • Keep hidden ground-truth labels separate from emitted market events.
  • Keep early quality gates infra-free: bun test should not require Docker, ClickHouse, NATS, or Redis.
  • Build deterministic foundations before demos, UI controls, or live synthetic service behavior.
  • Treat historical calibration as future work, not as a dependency for the MVP synthetic generator.
+
+
+
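The constraints above (seeded determinism, canonical event types only, labels kept apart from events) can be illustrated with a small infra-free sketch. Everything here is an assumption made for illustration: the field names, the `generateRun` function, and the choice of a simple linear congruential generator, which is used for brevity rather than statistical quality. Only the four event type names come from this document.

```typescript
// All field names below are illustrative assumptions, not the project's contracts.
type CanonicalEventType = "OptionPrint" | "OptionNBBO" | "EquityPrint" | "EquityQuote";

interface MarketEvent {
  type: CanonicalEventType;
  seq: number;
  symbol: string;
  price: number;
}

// Ground truth lives in its own structure, keyed by seq, never on the event.
interface GroundTruthLabel {
  seq: number;
  scenario: string;
}

interface SyntheticRun {
  events: MarketEvent[];
  labels: GroundTruthLabel[];
}

// Linear congruential generator: the same seed always gives the same sequence.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return function next(): number {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

function generateRun(seed: number, count: number): SyntheticRun {
  const rand = lcg(seed);
  const events: MarketEvent[] = [];
  const labels: GroundTruthLabel[] = [];
  for (let seq = 0; seq < count; seq++) {
    // Prices fall in [100, 110], rounded to cents.
    const price = Math.round((100 + rand() * 10) * 100) / 100;
    events.push({ type: "EquityPrint", seq: seq, symbol: "TEST", price: price });
    labels.push({ seq: seq, scenario: "baseline-noise" });
  }
  return { events: events, labels: labels };
}
```

Because the generator depends only on its seed and has no I/O, two calls with the same seed produce identical runs, which is the property an infra-free `bun test` gate would assert.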
+ +
+

Phase Sequence

+
Phase | Beads issue | Depends on | Purpose
01 - Deterministic spine | islandflow-259.1 | None | Create the seeded generation foundation and canonical event output contract.
02 - Manifests, fixtures, CLI | islandflow-259.2 | islandflow-zxh.1 | Turn deterministic generation into durable fixtures and manifests.
03 - Scenarios, labels, expected outputs | islandflow-259.3 | islandflow-zxh.2 | Author named scenarios, separate labels, and expected derived outputs.
04 - Replay integration | islandflow-259.4 | islandflow-zxh.3 | Make replay consume synthetic runs with stable ordering and output comparison.
05 - Demo and load profiles | islandflow-259.5 | islandflow-zxh.4 | Expose named deterministic demo/load profiles after replay validation.
99 - Future historical calibration | islandflow-259.6 | islandflow-259.5 | Calibrate parameters from historical data later, after the MVP is stable.
+
+
+ +
+

PR Split Notes

+
+
+ Phase 03a +

islandflow-259.3.1 - Scenario catalog and labels

+

Keep scenario authoring and ground-truth label shape focused before expected-output comparison grows around it.

+
+
+ Phase 03b +

islandflow-259.3.2 - Expected-output manifests

+

Store expected derived outputs as reviewable artifacts for downstream smart-flow validation.

+
+
+

If any other phase starts touching unrelated service, API, UI, and storage behavior in one PR, split it before implementation continues.

+
+ +
+

Matching Beads Epic

+
+
+

islandflow-259 - Plan synthetic market-data implementation phases.

+
+
+
+ +
HTML companion for docs/implementation/synthetic-market-data/00-roadmap.md. Source Markdown remains the active editable document.
+
diff --git a/docs/implementation/synthetic-market-data/index.html b/docs/implementation/synthetic-market-data/index.html
index dcb2262..e85f553 100644
--- a/docs/implementation/synthetic-market-data/index.html
+++ b/docs/implementation/synthetic-market-data/index.html
@@ -411,6 +411,9 @@ tr:last-child td { border-bottom: 0; }
 Beads islandflow-259
 7 source docs
 No app code
+ Implementation overview
+ Roadmap HTML
+ Architecture HTML

diff --git a/docs/plans/smart-flow-architecture-review.html b/docs/plans/smart-flow-architecture-review.html
new file mode 100644
index 0000000..826fa34
--- /dev/null
+++ b/docs/plans/smart-flow-architecture-review.html
@@ -0,0 +1,814 @@
Smart Flow Architecture Review
+
+
+

Plan Document

+

Evidence-Backed Smart-Flow Detection

+

A readable architecture review for reshaping Islandflow's smart-flow system around direct observation, evidence clusters, cautious hypotheses, preserved uncertainty, and replayable validation.

+
+ Smart flow + Architecture review + Refactor, not rewrite + Recommended: Option B + Roadmap HTML + Phase dossier +
+
+ +
+ + + +
+

Summary

+
+
+

No source code was modified as part of the architecture review. The conclusion is direct: the current architecture is not suitable as-is, but it is close enough to refactor. The stack is right; the domain language and pipeline shape are not.

+

The research direction should run from direct observation to inference to hypothesis, with preserved evidence and visible uncertainty. The system should stop emitting "smart money" as if it were a fact, and instead emit cautious, explainable smart-flow hypotheses.

+
+
+
+ +
+

Source Documents

+
+ + + +
+
+ +
+

Area Classification

+
Area | Call | Architecture Review
Domain model | refactor | Good bones, wrong center. Make evidence, hypotheses, scores, and alternatives first-class.
Event taxonomy | refactor | Raw/derived split is good; smart_money, dark.inferred, and classifier_hits leak overconfident product language.
Service boundaries | refactor | Ingest does too much signal policy; compute is too broad. Split pipeline stages before adding more intelligence.
FlowPacket | refactor | Keep concept, rename/reframe as FlowEvidenceCluster or FlowCandidate. Not a product domain object.
SmartMoneyEvent | redesign | Replace canonical object with FlowHypothesisEvent; use SmartFlowInsight only as UI/API projection.
Classifier pipeline | redesign | Current rules mix evidence extraction, hypothesis scoring, narrative labels, and alerting. Needs staged outputs.
ClickHouse/storage | refactor | Right datastore; raw tables are decent, derived evidence/hypotheses need typed/queryable columns plus JSON sidecars.
Redis baselines/cache | refactor | Right hot-state role; wrong as hidden baseline truth. Baselines need replayable snapshots/versioning.
NATS/JetStream subjects | refactor | Right bus; subjects should express stage/version: observations, evidence, hypotheses, insights.
Replay determinism | redesign | Present but not central enough. Replay must be the acceptance gate for derived outputs.
API/WebSocket | refactor | Mechanics are good; public surface should expose evidence bundles and hypotheses, not internal legacy names.
UI evidence model | refactor | Directionally good, but still foregrounds profile/probability over evidence quality, alternatives, and uncertainty.
Test strategy | redesign | Unit tests are solid scaffolding; needs fixture replay, false-positive suites, calibration, and end-to-end determinism.
+
+
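The NATS/JetStream row asks for subjects that express stage and version. As a rough illustration only, a subject builder might look like the sketch below. The `flow.<stage>.v<version>.<symbol>` layout and the `subjectFor` helper are assumptions, not the project's naming scheme; the only NATS behavior relied on is that subjects are dot-separated tokens and that `*` and `>` are wildcards, so those characters are rejected in the symbol token.

```typescript
// Hypothetical subject layout: flow.<stage>.v<version>.<symbol>
type Stage = "observations" | "evidence" | "hypotheses" | "insights";

function subjectFor(stage: Stage, version: number, symbol: string): string {
  if (version < 1 || Math.floor(version) !== version) {
    throw new Error("version must be a positive integer");
  }
  // A symbol containing ".", "*", ">", or whitespace would change the
  // token structure or act as a wildcard, so it is rejected outright.
  if (!/^[A-Za-z0-9_-]+$/.test(symbol)) {
    throw new Error("symbol is not a valid single subject token: " + symbol);
  }
  return ["flow", stage, "v" + version, symbol].join(".");
}
```

Under this assumed layout, a consumer could subscribe to `flow.hypotheses.v1.*` to receive every symbol for one stage and version, and a contract change would move to `v2` without disturbing existing subscribers.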
+ +
+

Direct Answers

+
+
+
  1. Current suitability: no. Useful infrastructure, but not yet an evidence-backed smart-flow architecture.
  2. SmartMoneyEvent: not a good canonical domain object. Use FlowHypothesisEvent. ParticipantHypothesisEvent implies participant identity too strongly. SmartFlowInsight should be a user-facing projection.
  3. FlowPacket: not as named. Keep the abstraction as an internal evidence cluster, but rename it to FlowEvidenceCluster or FlowCandidate.
  4. Service boundaries: not right. Ingest should normalize only; evidence quality, eligibility, clustering, hypothesis scoring, and insight projection should be separate stages.
  5. ClickHouse/Redis/NATS roles: yes, broadly. ClickHouse is the authoritative event/audit store. Redis is hot cache only. NATS is transport, not truth. All three need cleaner contracts.
  6. Replay central enough: no. It should be how every detection change proves itself.
  7. UI uncertainty: partially. It shows evidence refs, profile ladders, abstention, and suppression, but needs confidence vs conviction, alternative explanations, evidence quality, and why-not signals.
  8. First-class domain objects: raw observations, execution context, quote join, eligibility decision, evidence cluster, structure hypothesis, evidence quality score, baseline snapshot, hypothesis score vector, false-positive penalty, catalyst context, flow hypothesis event, smart-flow insight, replay run.
  9. Implementation details: Redis list layout, durable consumer names, current classifier thresholds, ClickHouse batch writer, adapter internals, legacy ClassifierHitEvent, alert severity math, UI cache mechanics.
  10. Delete/defer: canonical smart-money naming, real-time dark-pool certainty, standalone whale-premium alerts, trade-level open/close claims, participant identity claims, simplistic premium alert score, ingest-time signal filtering, retail_whale as a canonical profile unless reframed as attention/lottery flow.
+
+
+
+ +
+

Objects to Make First-Class

+
+ Raw Observation + Execution Context + Quote Join + Eligibility Decision + Flow Evidence Cluster + Structure Hypothesis + Evidence Quality Score + Baseline Snapshot + Hypothesis Score Vector + False-Positive Penalty + Catalyst Context + Flow Hypothesis Event + Smart-Flow Insight + Replay Run +
+
+ +
+

Options

+
+
+
+ Option A +

Conservative

+

Keep current objects and services; add evidence-quality fields, UI copy fixes, and replay tests.

+
+
+
  • Pros: Fastest, lowest migration risk, preserves current endpoints and UI.
  • Cons: Leaves misleading canonical names and keeps inference tangled in compute.
  • Complexity: Low.
  • Migration Risk: Low.
+
  1. Rename UI copy from smart money to smart flow candidate.
  2. Add evidence-quality and alternative-explanation fields to existing event.
  3. Add replay consistency tests around current outputs.
  4. Add typed ClickHouse columns for high-value JSON fields.
  5. Deprecate, but do not remove, legacy classifier hit display.
+
+
+ + + +
+
+ Option C +

Redesign

+

Start over with an event-sourced evidence engine and versioned, replayable policies.

+
+
+
  • Pros: Cleanest long-term architecture and strongest research discipline.
  • Cons: Slowest, overkill before product fit, and discards too much working infrastructure.
  • Complexity: Very high.
  • Migration Risk: High.
+
  1. Define new canonical event taxonomy and versioned policy registry.
  2. Build raw observation lake and deterministic replay runner first.
  3. Build evidence extraction and quote/condition eligibility services.
  4. Build cluster and structure hypothesis services.
  5. Build hypothesis scoring and calibration services.
  6. Build insight projection API.
  7. Rebuild terminal against new evidence/hypothesis contracts.
  8. Backfill or discard old derived data.
+
+
+
+
+ +
+

Recommendation

+
+

Choose Option B.

+

Option A is too timid for a pre-alpha product whose current names already fight the research. Option C is intellectually clean but wastes too much working infrastructure. Option B keeps the stack and terminal momentum while fixing the core mistake: treating smart money as a thing the system emits, instead of treating smart flow as a cautious, evidence-backed hypothesis with alternatives.

+

The first implementation move should be the contract/naming PR: introduce FlowHypothesisEvent and FlowEvidenceCluster with compatibility aliases, then make replay the gate before touching more classifier logic.

+
+
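The contract/naming PR described above could introduce the new names while keeping old call sites compiling through deprecated type aliases. The sketch below shows the idea in TypeScript; every field in these interfaces is a hypothetical placeholder, and only the type names FlowHypothesisEvent, FlowEvidenceCluster, FlowPacket, and SmartMoneyEvent come from the review.

```typescript
// All shapes are hypothetical; only the type names come from the review.
interface FlowEvidenceCluster {
  clusterId: string;
  evidenceRefs: string[];
}

interface FlowHypothesisEvent {
  hypothesisId: string;
  cluster: FlowEvidenceCluster;
  confidence: number;
  alternatives: string[];
  abstained: boolean;
}

/** @deprecated Use FlowEvidenceCluster. Kept so existing imports still compile. */
type FlowPacket = FlowEvidenceCluster;

/** @deprecated Use FlowHypothesisEvent. Kept so existing imports still compile. */
type SmartMoneyEvent = FlowHypothesisEvent;

// A legacy consumer still typed against the old name keeps working unchanged.
function legacyDescribe(legacyEvent: SmartMoneyEvent): string {
  return legacyEvent.hypothesisId + " (" + legacyEvent.cluster.evidenceRefs.length + " evidence refs)";
}

const legacyCluster: FlowPacket = { clusterId: "c-1", evidenceRefs: ["e-1", "e-2"] };

const sampleHypothesis: FlowHypothesisEvent = {
  hypothesisId: "hyp-1",
  cluster: legacyCluster,
  confidence: 0.62,
  alternatives: ["hedge"],
  abstained: false,
};
```

Since a type alias is purely a compile-time name, values typed as `FlowPacket` and `FlowEvidenceCluster` are interchangeable, so the aliases can be removed later in a follow-up PR without a data migration.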
+ +
HTML companion for docs/plans/smart-flow-architecture-review.md. Styled for the Islandflow product register.
+
diff --git a/docs/plans/synthetic-market-data-architecture-review.html b/docs/plans/synthetic-market-data-architecture-review.html
index d1a40db..1823f52 100644
--- a/docs/plans/synthetic-market-data-architecture-review.html
+++ b/docs/plans/synthetic-market-data-architecture-review.html
@@ -521,6 +521,8 @@
 Source: markdown review
 Mode: Plan
 Recommendation: Option B
+ Roadmap HTML
+ Phase dossier