Verifier-First Benchmark

AutoMechInterp Documentation

Deterministic verification for mechanistic-interpretability claims. Discovery systems propose; the evaluator checks evidence quality and returns clear outcomes for follow-up.


What This Benchmark Contributes

AutoMechInterp is a scientific reliability framework: it measures when mechanistic claims survive causal controls, robustness checks, and statistical policy, not just whether a pipeline can produce candidate explanations.

Core output: Typed gate outcomes, evidence tiers, and workflow actions that are stable under deterministic reruns.
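
As a minimal sketch, the core output can be pictured as one small typed record per gate. The names below are illustrative placeholders, not the project's actual schema; the tier values are taken from the Evidence Tiers table further down.

from dataclasses import dataclass
from enum import Enum


class EvidenceTier(Enum):
    """Tier values as listed in the Evidence Tiers table below."""
    CROSS_MODEL_CONFIRMED = "cross_model_confirmed"
    SINGLE_MODEL_CONFIRMED = "single_model_confirmed"
    CAUSAL_TESTED_UNSTABLE = "causal_tested_unstable"
    SUGGESTIVE = "suggestive"
    REJECTED = "rejected"


@dataclass(frozen=True)  # frozen: an outcome should not change between reruns
class GateOutcome:
    gate: str            # e.g. "causal_control" (hypothetical gate name)
    passed: bool
    tier: EvidenceTier
    action: str          # workflow action, e.g. "run confirmatory split"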

Independent Submission Workflow

  1. Initialize bundle contract from template.
  2. Run Stage-2 interventions and populate artifacts.
  3. Evaluate and generate workflow-oriented submission review.
  4. Use failed-gate remediations to plan the next round of experiments, as sketched below.
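
The loop can be sketched end to end in a few lines. Every function, path, and gate name here is a hypothetical placeholder rather than the project's actual API; the stubs only mark where real intervention and evaluation code would go.

import json
from pathlib import Path


def init_bundle(workdir: Path) -> Path:
    """Step 1: initialize a bundle contract from a minimal template."""
    workdir.mkdir(parents=True, exist_ok=True)
    bundle = workdir / "bundle.json"
    bundle.write_text(json.dumps({"claims": [], "artifacts": {}}, indent=2))
    return bundle


def run_stage2(bundle: Path) -> None:
    """Step 2: run interventions and record artifact paths (stubbed here)."""
    data = json.loads(bundle.read_text())
    data["artifacts"]["stage2"] = "results/stage2.json"  # placeholder path
    bundle.write_text(json.dumps(data, indent=2))


def evaluate(bundle: Path) -> list[dict]:
    """Steps 3-4: evaluate the bundle and return failed-gate remediations."""
    # A real evaluator would inspect the artifacts; this stub returns one item.
    return [{"gate": "causal_control", "action": "add an ablation baseline"}]


if __name__ == "__main__":
    bundle = init_bundle(Path("submission"))
    run_stage2(bundle)
    for item in evaluate(bundle):
        print(f"{item['gate']}: {item['action']}")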

See the full submission guide for details on each step.

Evidence Tiers

Tier                      Decision
cross_model_confirmed     Ready to share with transfer evidence
single_model_confirmed    Ready to share in single-model setting
causal_tested_unstable    Hold and run robustness follow-up
suggestive                Exploratory only; run confirmatory split
rejected                  Needs revision before claiming mechanism
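
One way a downstream tool might consume these tiers is a direct lookup that mirrors the table; decision_for is a hypothetical helper, and the strings are copied verbatim from the rows above.

# Maps each evaluator tier to the corresponding workflow decision.
TIER_DECISIONS = {
    "cross_model_confirmed": "Ready to share with transfer evidence",
    "single_model_confirmed": "Ready to share in single-model setting",
    "causal_tested_unstable": "Hold and run robustness follow-up",
    "suggestive": "Exploratory only; run confirmatory split",
    "rejected": "Needs revision before claiming mechanism",
}


def decision_for(tier: str) -> str:
    """Look up the workflow decision for a tier string from a gate outcome."""
    return TIER_DECISIONS.get(tier, "Unknown tier; check the bundle schema")


print(decision_for("suggestive"))  # Exploratory only; run confirmatory split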

Open-Source Project Assets

  • Stratified field-level findings and failure concentration analysis
  • Evaluator-agnostic and adaptive adversarial stress diagnostics
  • Runtime/cost envelope and exact reproducibility runbook
  • Claim bundle specification + compatibility vectors


One-Command Rebuild

python main/reproducibility_audit.py

This regenerates environment manifests, breadth summaries, field-level findings, stress outputs, runtime/cost diagnostics, and community-value reports under main/output/repro.
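
To see what "stable under deterministic reruns" buys, one can hash the output tree across two runs of the command above. This is a sketch that assumes the script is invoked from the repository root; tree_digest is an illustrative helper, not part of the project.

import hashlib
import subprocess
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash every file under root, visiting paths in a stable order."""
    h = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h.update(path.relative_to(root).as_posix().encode())
            h.update(path.read_bytes())
    return h.hexdigest()


digests = []
for _ in range(2):
    subprocess.run(["python", "main/reproducibility_audit.py"], check=True)
    digests.append(tree_digest(Path("main/output/repro")))

print("deterministic rerun" if digests[0] == digests[1] else "outputs differ")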