Track project docs
Remove docs from .gitignore and add AGENTS.md, PLAN.md, CODING_STYLE.md, and RESEARCH.md to the repository.
This commit is contained in:
parent
7996d00677
commit
82861408e4
5 changed files with 671 additions and 4 deletions
123
RESEARCH.md
Normal file
123
RESEARCH.md
Normal file
|
|
@ -0,0 +1,123 @@
|
|||
# RESEARCH.md — Signal Evaluation & Backtesting Discipline
|
||||
|
||||
This document defines **how research is conducted and interpreted** in this repository.
|
||||
|
||||
Its purpose is to **prevent self-deception**, not to slow exploration.
|
||||
|
||||
---
|
||||
|
||||
## Core Research Principles
|
||||
|
||||
- **No hindsight**
|
||||
- **No intent claims**
|
||||
- **No performance without context**
|
||||
- **No conclusions without uncertainty**
|
||||
|
||||
All results are provisional unless explicitly validated.
|
||||
|
||||
---
|
||||
|
||||
## Fact vs Inference
|
||||
|
||||
- Raw market data = **fact**
|
||||
- Classifiers, signals, and labels = **inference**
|
||||
|
||||
Inference must always:
|
||||
- reference its evidence
|
||||
- expose confidence
|
||||
- be stored separately from raw data
|
||||
|
||||
---
|
||||
|
||||
## Labeling Rules
|
||||
|
||||
Every evaluation must specify:
|
||||
- Time horizon (e.g. 5m, 15m, 60m)
|
||||
- Metric (return, vol expansion, MAE/MFE, etc.)
|
||||
- Directional vs volatility outcome
|
||||
- Entry definition (event time, next tick, next candle)
|
||||
|
||||
No implicit labels. Ever.
|
||||
|
||||
---
|
||||
|
||||
## Backtesting Constraints
|
||||
|
||||
- No lookahead bias
|
||||
- No survivorship bias
|
||||
- Use only data available **at the event timestamp**
|
||||
- OI snapshots must match the event date
|
||||
|
||||
Replay pipelines must mirror live pipelines exactly.
|
||||
|
||||
---
|
||||
|
||||
## Evaluation Metrics (minimum set)
|
||||
|
||||
At least one of:
|
||||
- Precision / recall
|
||||
- Hit rate vs baseline
|
||||
- Forward return distribution
|
||||
- Vol realized vs implied
|
||||
- Calibration (probability vs outcome)
|
||||
|
||||
Single “win rate” numbers are insufficient.
|
||||
|
||||
---
|
||||
|
||||
## Regime Awareness
|
||||
|
||||
Results must be contextualized by:
|
||||
- Market regime (trend / chop / high vol)
|
||||
- Time of day
|
||||
- DTE bucket
|
||||
- Liquidity conditions
|
||||
|
||||
If a signal only works sometimes, that’s still information.
|
||||
|
||||
---
|
||||
|
||||
## Threshold Tuning Rules
|
||||
|
||||
- Thresholds may be tuned **only** on a defined training window.
|
||||
- Validation must occur on disjoint data.
|
||||
- Never tune on the same period you report.
|
||||
|
||||
Document when thresholds change and why.
|
||||
|
||||
---
|
||||
|
||||
## Language Discipline
|
||||
|
||||
Allowed:
|
||||
- “suggests”
|
||||
- “consistent with”
|
||||
- “correlated with”
|
||||
- “higher likelihood”
|
||||
|
||||
Disallowed:
|
||||
- “smart money”
|
||||
- “institutional intent”
|
||||
- “guaranteed”
|
||||
- “predicts”
|
||||
|
||||
---
|
||||
|
||||
## Recording Results
|
||||
|
||||
For each classifier or hypothesis, record:
|
||||
- What was tested
|
||||
- What failed
|
||||
- What partially worked
|
||||
- What conditions mattered
|
||||
|
||||
Failed ideas are assets. Keep them.
|
||||
|
||||
---
|
||||
|
||||
## Final Reminder
|
||||
|
||||
This system is for **understanding behavior**, not for proving superiority.
|
||||
|
||||
If a result cannot survive replay, uncertainty, and explanation,
|
||||
it does not belong in production logic.
|
||||
Loading…
Add table
Add a link
Reference in a new issue