Plan Document

Evidence-Backed Smart-Flow Detection

A readable architecture review for reshaping Islandflow's smart-flow system around direct observation, evidence clusters, cautious hypotheses, preserved uncertainty, and replayable validation.

Smart flow | Architecture review | Refactor, not rewrite | Recommended: Option B | Roadmap HTML | Phase dossier

Summary

No source code was modified as part of the architecture review. The conclusion is direct: the current architecture is not suitable as-is, but it is close enough to refactor. The stack is right; the domain language and pipeline shape are not.

The research direction should run from direct observation to inference to hypothesis, preserving evidence and keeping uncertainty visible at each step. The system should stop emitting "smart money" as if it were a fact and instead emit cautious, explainable smart-flow hypotheses.
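To make "cautious, explainable hypothesis" concrete, here is a minimal sketch of what such an object could carry. It is not the review's proposed schema: the class names, field names, and the sample labels ("directional accumulation", "hedge", "roll") are all assumptions chosen for illustration. The point is only that a hypothesis cannot be constructed without evidence references and that alternatives travel with it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Alternative:
    """A competing explanation that the evidence does not rule out."""
    label: str
    weight: float  # rough plausibility in [0, 1]; illustrative only


@dataclass(frozen=True)
class FlowHypothesis:
    """A cautious claim that always carries its evidence and its doubts."""
    label: str
    confidence: float                 # support from evidence, in [0, 1]
    evidence_refs: Tuple[str, ...]    # ids of preserved raw observations
    alternatives: Tuple[Alternative, ...] = ()

    def __post_init__(self) -> None:
        # Reject claims that hide their uncertainty or their evidence.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")
        if not self.evidence_refs:
            raise ValueError("a hypothesis needs at least one evidence ref")

    def describe(self) -> str:
        alts = ", ".join(f"{a.label} {a.weight:.2f}" for a in self.alternatives)
        return (
            f"possible {self.label} "
            f"(confidence {self.confidence:.2f}, "
            f"{len(self.evidence_refs)} evidence refs, "
            f"alternatives: {alts or 'none recorded'})"
        )


h = FlowHypothesis(
    label="directional accumulation",
    confidence=0.62,
    evidence_refs=("obs-1", "obs-2"),
    alternatives=(Alternative("hedge", 0.25), Alternative("roll", 0.10)),
)
print(h.describe())
```

The wording produced by `describe` starts with "possible" on purpose, so the rendered text never reads as a statement of fact.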

Source Documents

Area Classification

Area | Call | Architecture Review
Domain model | refactor | Good bones, wrong center. Make evidence, hypotheses, scores, and alternatives first-class.
Event taxonomy | refactor | Raw/derived split is good; smart_money, dark.inferred, and classifier_hits leak overconfident product language.
Service boundaries | refactor | Ingest does too much signal policy; compute is too broad. Split pipeline stages before adding more intelligence.
FlowPacket | refactor | Keep concept, rename/reframe as FlowEvidenceCluster or FlowCandidate. Not a product domain object.
SmartMoneyEvent | redesign | Replace canonical object with FlowHypothesisEvent; use SmartFlowInsight only as UI/API projection.
Classifier pipeline | redesign | Current rules mix evidence extraction, hypothesis scoring, narrative labels, and alerting. Needs staged outputs.
ClickHouse/storage | refactor | Right datastore; raw tables are decent, derived evidence/hypotheses need typed/queryable columns plus JSON sidecars.
Redis baselines/cache | refactor | Right hot-state role; wrong as hidden baseline truth. Baselines need replayable snapshots/versioning.
NATS/JetStream subjects | refactor | Right bus; subjects should express stage/version: observations, evidence, hypotheses, insights.
Replay determinism | redesign | Present but not central enough. Replay must be the acceptance gate for derived outputs.
API/WebSocket | refactor | Mechanics are good; public surface should expose evidence bundles and hypotheses, not internal legacy names.
UI evidence model | refactor | Directionally good, but still foregrounds profile/probability over evidence quality, alternatives, and uncertainty.
Test strategy | redesign | Unit tests are solid scaffolding; needs fixture replay, false-positive suites, calibration, and end-to-end determinism.
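The NATS/JetStream row asks for subjects that express stage and version. One possible way to enforce that is a small helper that is the only place subjects get built. The `flow.` prefix, the token order, and the `options` kind below are assumptions, not names taken from the codebase; the only facts relied on are that NATS subjects are dot-separated tokens and that `*` and `>` are wildcards.

```python
STAGES = ("observations", "evidence", "hypotheses", "insights")

# Characters that would break a single subject token: the token separator,
# the two NATS wildcards, and whitespace.
_RESERVED = set(".*> \t")


def stage_subject(stage: str, kind: str, version: int) -> str:
    """Build a subject such as 'flow.hypotheses.options.v1' (layout is assumed)."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if not kind or any(ch in _RESERVED for ch in kind):
        raise ValueError(f"kind must be a single subject token: {kind!r}")
    if version < 1:
        raise ValueError("version must be >= 1")
    return f"flow.{stage}.{kind}.v{version}"


def stage_wildcard(stage: str) -> str:
    """Subscription pattern covering every kind and version of one stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return f"flow.{stage}.>"


print(stage_subject("hypotheses", "options", 1))
print(stage_wildcard("evidence"))
```

Keeping the version as its own trailing token lets a new scoring policy publish alongside the old one while replay compares the two.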

Direct Answers

  1. Current suitability: no. Useful infrastructure, but not yet an evidence-backed smart-flow architecture.
  2. SmartMoneyEvent: not a good canonical domain object. Use FlowHypothesisEvent. ParticipantHypothesisEvent implies participant identity too strongly. SmartFlowInsight should be a user-facing projection.
  3. FlowPacket: not as named. Keep the abstraction as an internal evidence cluster, rename to FlowEvidenceCluster or FlowCandidate.
  4. Service boundaries: not right. Ingest should normalize only; evidence quality, eligibility, clustering, hypothesis scoring, and insight projection should be separate stages.
  5. ClickHouse/Redis/NATS roles: yes broadly. ClickHouse is the authoritative event/audit store. Redis is hot cache only. NATS is transport, not truth. All three need cleaner contracts.
  6. Replay central enough: no. It should be how every detection change proves itself.
  7. UI uncertainty: partially. It shows evidence refs, profile ladders, abstention, and suppression, but needs confidence vs conviction, alternative explanations, evidence quality, and why-not signals.
  8. First-class domain objects: raw observations, execution context, quote join, eligibility decision, evidence cluster, structure hypothesis, evidence quality score, baseline snapshot, hypothesis score vector, false-positive penalty, catalyst context, flow hypothesis event, smart-flow insight, replay run.
  9. Implementation details: Redis list layout, durable consumer names, current classifier thresholds, ClickHouse batch writer, adapter internals, legacy ClassifierHitEvent, alert severity math, UI cache mechanics.
  10. Delete/defer: canonical smart-money naming, real-time dark-pool certainty, standalone whale-premium alerts, trade-level open/close claims, participant identity claims, simplistic premium alert score, ingest-time signal filtering, retail_whale as a canonical profile unless reframed as attention/lottery flow.
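The replay answer above can be sketched as a small acceptance check: run the same fixture through a pipeline more than once and require byte-identical derived output. This is an illustrative sketch, not the project's replay runner; `fingerprint`, `replay_gate`, the fixture shape, and both example pipelines are hypothetical. The second pipeline deliberately keeps hidden state between runs, which is the kind of behavior the review flags when baselines live only in a hot cache.

```python
import hashlib
import json


def fingerprint(events) -> str:
    """Stable hash of derived output: canonical JSON, then SHA-256."""
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_gate(pipeline, fixture, runs: int = 2) -> bool:
    """Pass only if every run over the same fixture yields identical output."""
    prints = {fingerprint(pipeline(list(fixture))) for _ in range(runs)}
    return len(prints) == 1


FIXTURE = [
    {"id": "obs-2", "ts": 2, "price": 3, "size": 100},
    {"id": "obs-1", "ts": 1, "price": 2, "size": 50},
]


def pure_pipeline(observations):
    # Output depends only on the input, with an explicit ordering.
    ordered = sorted(observations, key=lambda o: (o["ts"], o["id"]))
    return [{"id": o["id"], "premium": o["price"] * o["size"]} for o in ordered]


def make_stateful_pipeline():
    seen = {"count": 0}  # hidden state that survives between runs

    def pipeline(observations):
        seen["count"] += len(observations)
        return [{"id": o["id"], "baseline": seen["count"]} for o in observations]

    return pipeline


print(replay_gate(pure_pipeline, FIXTURE))             # True
print(replay_gate(make_stateful_pipeline(), FIXTURE))  # False
```

A fuller gate would also compare the fingerprint against one recorded for the previous policy version, so that any change in derived output is a deliberate, reviewed event.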

Objects to Make First-Class

Raw Observation, Execution Context, Quote Join, Eligibility Decision, Flow Evidence Cluster, Structure Hypothesis, Evidence Quality Score, Baseline Snapshot, Hypothesis Score Vector, False-Positive Penalty, Catalyst Context, Flow Hypothesis Event, Smart-Flow Insight, Replay Run
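As one illustration of what "first-class" could mean, the sketch below treats an eligibility decision as a recorded object with explicit reasons instead of a silent filter at ingest. The field names, the reason codes, and the `min_size` threshold are assumptions made for the example; the review does not specify them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RawObservation:
    id: str
    price: float
    size: int
    bid: Optional[float]  # None when no quote could be joined
    ask: Optional[float]


@dataclass(frozen=True)
class EligibilityDecision:
    observation_id: str
    eligible: bool
    reasons: Tuple[str, ...]  # why-not reasons, kept for audit and UI


def decide(obs: RawObservation, min_size: int = 10) -> EligibilityDecision:
    """Record why an observation is or is not eligible; never drop it silently."""
    reasons = []
    if obs.bid is None or obs.ask is None:
        reasons.append("missing_quote")
    elif obs.bid > obs.ask:
        reasons.append("crossed_quote")
    if obs.size < min_size:
        reasons.append("below_min_size")
    return EligibilityDecision(obs.id, not reasons, tuple(reasons))


print(decide(RawObservation("obs-1", 1.0, 5, None, 1.1)))
print(decide(RawObservation("obs-2", 1.0, 50, 0.9, 1.1)))
```

Because ineligible observations still produce a decision, the same records can feed the why-not signals the UI is missing.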

Options

Option A

Conservative

Keep current objects and services; add evidence-quality fields, UI copy fixes, and replay tests.

  • Pros

    Fastest, lowest migration risk, preserves current endpoints and UI.

  • Cons

    Leaves misleading canonical names and keeps inference tangled in compute.

  • Complexity

    Low.

  • Migration Risk

    Low.

  1. Rename UI copy from "smart money" to "smart flow candidate".
  2. Add evidence-quality and alternative-explanation fields to existing event.
  3. Add replay consistency tests around current outputs.
  4. Add typed ClickHouse columns for high-value JSON fields.
  5. Deprecate, but do not remove, legacy classifier hit display.
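
Step 4 pairs typed columns with a JSON sidecar. A minimal sketch of that split, done in application code before a row is written, might look like the following. The column names and the `payload_json` sidecar name are hypothetical, and no ClickHouse client call is shown.

```python
import json

# Fields promoted to typed, queryable columns (names are assumptions).
TYPED_COLUMNS = ("event_id", "symbol", "confidence", "evidence_quality")


def to_row(event: dict) -> dict:
    """Split an event into typed columns plus a JSON sidecar for the rest."""
    row = {col: event.get(col) for col in TYPED_COLUMNS}
    extras = {k: v for k, v in event.items() if k not in TYPED_COLUMNS}
    row["payload_json"] = json.dumps(extras, sort_keys=True)
    return row


row = to_row({
    "event_id": "evt-1",
    "symbol": "XYZ",
    "confidence": 0.62,
    "evidence_quality": 0.8,
    "alternatives": ["hedge", "roll"],
})
print(row)
```

Sorting the sidecar keys keeps the stored JSON stable across runs, which matters once replay compares derived rows.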

Option C

Redesign

Start over with an event-sourced evidence engine and versioned, replayable policies.

  • Pros

    Cleanest long-term architecture and strongest research discipline.

  • Cons

    Slowest, overkill before product fit, and discards too much working infrastructure.

  • Complexity

    Very high.

  • Migration Risk

    High.

  1. Define new canonical event taxonomy and versioned policy registry.
  2. Build raw observation lake and deterministic replay runner first.
  3. Build evidence extraction and quote/condition eligibility services.
  4. Build cluster and structure hypothesis services.
  5. Build hypothesis scoring and calibration services.
  6. Build insight projection API.
  7. Rebuild terminal against new evidence/hypothesis contracts.
  8. Backfill or discard old derived data.

Recommendation

Choose Option B.

Option A is too timid for a pre-alpha product whose current names already fight the research. Option C is intellectually clean but wastes too much working infrastructure. Option B keeps the stack and terminal momentum while fixing the core mistake: treating smart money as a thing the system emits, instead of treating smart flow as a cautious, evidence-backed hypothesis with alternatives.

The first implementation move should be the contract/naming PR: introduce FlowHypothesisEvent and FlowEvidenceCluster with compatibility aliases, then make replay the gate before touching more classifier logic.
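
A compatibility alias for that first PR could be as small as a lookup that returns the new type and warns about the old name. The sketch below assumes the legacy names map one-to-one onto the new ones, as the review suggests for SmartMoneyEvent and FlowPacket; the fields on each class and the `resolve_legacy` helper are hypothetical.

```python
import warnings
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlowHypothesisEvent:
    event_id: str
    hypothesis: str
    confidence: float


@dataclass(frozen=True)
class FlowEvidenceCluster:
    cluster_id: str
    observation_ids: Tuple[str, ...]


# Old name -> new type. Callers keep working while they migrate.
_LEGACY_ALIASES = {
    "SmartMoneyEvent": FlowHypothesisEvent,
    "FlowPacket": FlowEvidenceCluster,
}


def resolve_legacy(name: str):
    """Return the new type for a legacy name and emit a deprecation warning."""
    target = _LEGACY_ALIASES[name]
    warnings.warn(
        f"{name} is deprecated; use {target.__name__}",
        DeprecationWarning,
        stacklevel=2,
    )
    return target


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_cls = resolve_legacy("SmartMoneyEvent")

print(legacy_cls.__name__, len(caught))
```

Routing every legacy lookup through one function also gives a single place to count remaining callers before the old names are removed.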