Track project docs

Remove docs from .gitignore and add AGENTS.md, PLAN.md, CODING_STYLE.md, and RESEARCH.md to the repository.
2025-12-29 22:10:26 -05:00 · 2025-12-29 22:10:26 -05:00 · 82861408e4
commit 82861408e4
parent 7996d00677
5 changed files with 671 additions and 4 deletions
--- a/RESEARCH.md
+++ b/RESEARCH.md
@ -0,0 +1,123 @@
+# RESEARCH.md — Signal Evaluation & Backtesting Discipline
+
+This document defines **how research is conducted and interpreted** in this repository.
+
+Its purpose is to **prevent self-deception**, not to slow exploration.
+
+---
+
+## Core Research Principles
+
+- **No hindsight**
+- **No intent claims**
+- **No performance without context**
+- **No conclusions without uncertainty**
+
+All results are provisional unless explicitly validated.
+
+---
+
+## Fact vs Inference
+
+- Raw market data = **fact**
+- Classifiers, signals, and labels = **inference**
+
+Inference must always:
+- reference its evidence
+- expose confidence
+- be stored separately from raw data
+
+---
+
+## Labeling Rules
+
+Every evaluation must specify:
+- Time horizon (e.g. 5m, 15m, 60m)
+- Metric (return, vol expansion, MAE/MFE, etc.)
+- Directional vs volatility outcome
+- Entry definition (event time, next tick, next candle)
+
+No implicit labels. Ever.
+
+---
+
+## Backtesting Constraints
+
+- No lookahead bias
+- No survivorship bias
+- Use only data available **at the event timestamp**
+- OI snapshots must match the event date
+
+Replay pipelines must mirror live pipelines exactly.
+
+---
+
+## Evaluation Metrics (minimum set)
+
+At least one of:
+- Precision / recall
+- Hit rate vs baseline
+- Forward return distribution
+- Vol realized vs implied
+- Calibration (probability vs outcome)
+
+Single “win rate” numbers are insufficient.
+
+---
+
+## Regime Awareness
+
+Results must be contextualized by:
+- Market regime (trend / chop / high vol)
+- Time of day
+- DTE bucket
+- Liquidity conditions
+
+If a signal only works sometimes, that’s still information.
+
+---
+
+## Threshold Tuning Rules
+
+- Thresholds may be tuned **only** on a defined training window.
+- Validation must occur on disjoint data.
+- Never tune on the same period you report.
+
+Document when thresholds change and why.
+
+---
+
+## Language Discipline
+
+Allowed:
+- “suggests”
+- “consistent with”
+- “correlated with”
+- “higher likelihood”
+
+Disallowed:
+- “smart money”
+- “institutional intent”
+- “guaranteed”
+- “predicts”
+
+---
+
+## Recording Results
+
+For each classifier or hypothesis, record:
+- What was tested
+- What failed
+- What partially worked
+- What conditions mattered
+
+Failed ideas are assets. Keep them.
+
+---
+
+## Final Reminder
+
+This system is for **understanding behavior**, not for proving superiority.
+
+If a result cannot survive replay, uncertainty, and explanation,
+it does not belong in production logic.