Independent Submission Workflow
- Initialize bundle contract from template.
- Run Stage-2 interventions and populate artifacts.
- Evaluate the bundle and generate a workflow-oriented submission review.
- Use failed-gate remediations to plan next experiments.
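The last step above can be sketched as a small filter over gate outcomes. This is only an illustration: the `GateOutcome` type, its fields, the gate names, and the remediation strings are all hypothetical and are not taken from the AutoMechInterp codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateOutcome:
    """Hypothetical record for one evaluated gate (not the project's real schema)."""
    gate: str               # illustrative gate name
    passed: bool            # whether the gate's check succeeded
    remediation: str = ""   # suggested follow-up when the gate fails


def plan_next_experiments(outcomes):
    """Return the remediation for each failed gate, preserving evaluation order."""
    return [o.remediation for o in outcomes if not o.passed and o.remediation]


# Example outcomes; the names and remediations are made up for illustration.
outcomes = [
    GateOutcome("causal_control", True),
    GateOutcome("robustness", False, "rerun with additional seeds"),
    GateOutcome("statistical_policy", False, "increase confirmatory sample size"),
]
next_steps = plan_next_experiments(outcomes)
```

Keeping the remediation next to the gate result means the plan for the next round of experiments can be derived directly from the review, without a separate lookup.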
Deterministic verification for mechanistic-interpretability claims. Discovery systems propose; the evaluator checks evidence quality and returns clear outcomes for follow-up.
AutoMechInterp is a scientific reliability framework: it measures when mechanistic claims survive causal controls, robustness checks, and statistical policy, not just whether a pipeline can produce candidate explanations.
Core output: typed gate outcomes, evidence tiers, and workflow actions that are stable under deterministic reruns.
| Tier | Decision |
|---|---|
| `cross_model_confirmed` | Ready to share with transfer evidence |
| `single_model_confirmed` | Ready to share in single-model setting |
| `causal_tested_unstable` | Hold and run robustness follow-up |
| `suggestive` | Exploratory only; run confirmatory split |
| `rejected` | Needs revision before claiming mechanism |
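For downstream tooling, the tier table can be treated as a plain lookup. The sketch below copies the tier names and decisions from the table; the function names `decision_for` and `is_ready_to_share` are assumptions made for illustration, not part of the project's API.

```python
# Tier -> decision text, copied from the table above.
TIER_DECISIONS = {
    "cross_model_confirmed": "Ready to share with transfer evidence",
    "single_model_confirmed": "Ready to share in single-model setting",
    "causal_tested_unstable": "Hold and run robustness follow-up",
    "suggestive": "Exploratory only; run confirmatory split",
    "rejected": "Needs revision before claiming mechanism",
}


def decision_for(tier: str) -> str:
    """Look up the workflow decision for a tier; reject unknown tier names."""
    try:
        return TIER_DECISIONS[tier]
    except KeyError:
        raise ValueError(f"unknown evidence tier: {tier!r}") from None


def is_ready_to_share(tier: str) -> bool:
    """True only for tiers whose decision in the table begins with 'Ready to share'."""
    return decision_for(tier).startswith("Ready to share")
```

Raising on an unknown tier, rather than returning a default, keeps a typo in a tier name from being silently read as a decision.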
python main/reproducibility_audit.py
Running this command regenerates environment manifests, breadth summaries, field-level findings, stress outputs, runtime/cost diagnostics, and community-value reports under `main/output/repro`.
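One way to check that two audit runs produced the same outputs is to fingerprint the output directory and compare digests. This is a minimal sketch, not part of the audit script: `fingerprint_tree` is a hypothetical helper, and it assumes the regenerated files are byte-identical across reruns, which may not hold for files that embed timestamps or host details.

```python
import hashlib
from pathlib import Path


def fingerprint_tree(root):
    """Return a SHA-256 digest over every file's relative path and bytes under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort paths so the digest does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between the path and the file contents
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between consecutive files
    return digest.hexdigest()
```

A typical use would be to call `fingerprint_tree("main/output/repro")` after each run and compare the two hex strings; a mismatch indicates that at least one file's path or contents changed.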