Submitting A Claim Bundle¶
This page defines a practical, reproducible submission workflow.
Required files¶
- protocol.yaml
- hypothesis.jsonl
- evaluation_result.json
- manifest.json

Optional:

- cross_model_results.json
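As a quick pre-flight check, the presence of the required artifacts can be verified before invoking the evaluator. This is a minimal sketch: the `check_required_files` helper is illustrative and not part of the evaluator CLI; only the file names come from the list above.

```python
from pathlib import Path

# File names taken from the required/optional lists above.
REQUIRED = ["protocol.yaml", "hypothesis.jsonl",
            "evaluation_result.json", "manifest.json"]
OPTIONAL = ["cross_model_results.json"]

def check_required_files(bundle_dir):
    """Return the required artifacts missing from bundle_dir."""
    bundle = Path(bundle_dir)
    return [name for name in REQUIRED if not (bundle / name).exists()]
```

An empty return value means the bundle is structurally ready for validation; any listed name must be regenerated before submission.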
Submission lifecycle¶
- Build bundle artifacts from discovery outputs.
- Validate schema and hash integrity.
- Run evaluator and report generation.
- Run deterministic submission review.
- Resolve failed gate clusters.
- Resubmit with updated evidence.
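The hash-integrity step of the lifecycle can be sketched as follows. This assumes, purely for illustration, that `manifest.json` carries a `{"files": {relative_path: sha256_hex}}` mapping; adapt the key layout to the real manifest schema.

```python
import hashlib
import json
from pathlib import Path

def verify_hashes(bundle_dir, manifest_name="manifest.json"):
    """Compare sha256 digests of bundle files against the manifest.

    The {"files": {path: sha256_hex}} layout is an assumption made for
    this sketch, not the evaluator's documented manifest schema.
    """
    bundle = Path(bundle_dir)
    manifest = json.loads((bundle / manifest_name).read_text())
    mismatches = {}
    for rel_path, expected in manifest.get("files", {}).items():
        digest = hashlib.sha256((bundle / rel_path).read_bytes()).hexdigest()
        if digest != expected:
            mismatches[rel_path] = digest
    return mismatches  # empty dict == all hashes verified
```

A non-empty result identifies exactly which artifacts were modified after the manifest was written.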
Mandatory validation commands¶
```bash
python -m automechinterp_evaluator.cli evaluate \
  --bundle /path/to/bundle \
  --output /path/to/bundle/result.json

python -m automechinterp_evaluator.cli report \
  --bundle /path/to/bundle \
  --output /path/to/bundle/stage_gate_report.md

python -m automechinterp_evaluator.cli submission-review \
  --bundle /path/to/bundle \
  --reruns 3 \
  --output-json /path/to/bundle/submission_review.json \
  --output-md /path/to/bundle/submission_review.md
```
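The three commands above can be assembled in a small driver script so resubmissions rerun them consistently. This sketch only builds the argv lists shown above; executing them (e.g. via `subprocess.run` with fail-fast checking) is left to the caller, and the `reruns` default simply mirrors the example.

```python
import sys

def build_validation_commands(bundle, reruns=3):
    """Return the three mandatory validation commands as argv lists."""
    cli = [sys.executable, "-m", "automechinterp_evaluator.cli"]
    return [
        cli + ["evaluate", "--bundle", bundle,
               "--output", f"{bundle}/result.json"],
        cli + ["report", "--bundle", bundle,
               "--output", f"{bundle}/stage_gate_report.md"],
        cli + ["submission-review", "--bundle", bundle,
               "--reruns", str(reruns),
               "--output-json", f"{bundle}/submission_review.json",
               "--output-md", f"{bundle}/submission_review.md"],
    ]
```

Keeping the commands in one place makes it easy to run the full sequence after every evidence update rather than cherry-picking steps.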
What makes a submission strong¶
- Complete confirmatory and exploratory slices where required
- Coherent control evidence
- Robust behavior across perturbations
- Statistical checks that pass under declared policy
- Deterministic rerun agreement
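Deterministic rerun agreement can be spot-checked by comparing the rerun result files for structural equality. The helper below is a sketch under the assumption that each rerun writes a JSON result file; point it at whatever rerun outputs the submission review actually produces.

```python
import json
from pathlib import Path

def reruns_agree(result_paths):
    """True when all rerun result files parse to identical JSON values.

    The caller supplies the paths; this sketch assumes nothing about
    the evaluator's file naming scheme.
    """
    results = [json.loads(Path(p).read_text()) for p in result_paths]
    return all(r == results[0] for r in results[1:])
```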
What usually blocks acceptance¶
| Failure family | Typical symptom | Suggested follow-up |
|---|---|---|
| confirmatory completeness | `confirmatory_present` fails | regenerate confirmatory slice |
| method sensitivity | unstable outcomes across methods | tighten intervention protocol consistency |
| robustness | effect collapses under perturbations | add robustness-focused follow-up experiments |
| controls leakage | strong control effects | redesign controls and baseline comparisons |
| statistical rigor | CI/power/multiplicity failures | increase evidence quality and sample adequacy |
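The table above can double as a triage map when resolving failed gate clusters. In this sketch the failure-family keys are the table's labels, not the evaluator's canonical gate identifiers, so adjust them to match the real report output.

```python
# Triage map derived from the table above; the keys are assumptions,
# not the evaluator's canonical gate-cluster names.
FOLLOW_UPS = {
    "confirmatory completeness": "regenerate confirmatory slice",
    "method sensitivity": "tighten intervention protocol consistency",
    "robustness": "add robustness-focused follow-up experiments",
    "controls leakage": "redesign controls and baseline comparisons",
    "statistical rigor": "increase evidence quality and sample adequacy",
}

def suggest_follow_ups(failed_families):
    """Map failed families to the table's suggested follow-ups."""
    return [FOLLOW_UPS.get(f, "inspect gate details manually")
            for f in failed_families]
```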
Submission package recommendation¶
When sharing externally, include:
- full bundle directory
- submission_review.json
- submission_review.md
- protocol hash/version
- environment manifest path
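For external sharing, the full bundle directory can be packaged into a single archive. This is a minimal sketch: the zip format and `package_submission` helper are illustrative choices; the workflow only mandates which artifacts are included, not how they are bundled.

```python
import zipfile
from pathlib import Path

def package_submission(bundle_dir, archive_path):
    """Zip the full bundle directory, preserving relative paths.

    Illustrative helper: archive format and naming are not mandated
    by the submission workflow.
    """
    bundle = Path(bundle_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(bundle.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(bundle))
    return archive_path
```

Archiving the directory wholesale keeps the review outputs, protocol hash, and environment manifest adjacent to the evidence they describe.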