# CLI Reference
This page summarizes the primary evaluator CLI commands and how they are typically used together.
## Command map
| Command | Purpose | Typical output |
|---|---|---|
| `init-template` | scaffold minimal valid bundle | bundle directory |
| `evaluate` | compute machine-readable gate/tier outputs | `result.json` |
| `report` | render markdown summary report | `stage_gate_report.md` |
| `submission-review` | run deterministic reruns + workflow guidance | `submission_review.json`, `submission_review.md` |
| `reference-vectors` | emit compatibility vectors for cross-impl checks | reference vector files |
## evaluate

```shell
python -m automechinterp_evaluator.cli evaluate \
  --bundle /path/to/bundle \
  --output /path/to/bundle/result.json
```
Use this as the canonical machine-readable record for downstream tooling.
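As a sketch of what downstream consumption might look like, the snippet below loads a `result.json` and pulls out two fields. The `gate` and `tier` field names are assumptions for illustration; the actual schema is defined by the evaluator itself.

```python
import json
import os
import tempfile

def load_result(path):
    """Load an evaluate result and extract the fields downstream tooling
    typically cares about. NOTE: the "gate" and "tier" keys are assumed
    here for illustration, not taken from the documented schema."""
    with open(path) as f:
        result = json.load(f)
    return result.get("gate"), result.get("tier")

# Demo with a stand-in file; in practice the file is produced by `evaluate`.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "result.json")
    with open(path, "w") as f:
        json.dump({"gate": "pass", "tier": 2}, f)
    gate, tier = load_result(path)
```

Keeping the parsing in one helper makes it easy to adapt when the real schema differs from this guess.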
## report

```shell
python -m automechinterp_evaluator.cli report \
  --bundle /path/to/bundle \
  --output /path/to/bundle/stage_gate_report.md
```
Use this for human review, writeups, and triage meetings.
## submission-review

```shell
python -m automechinterp_evaluator.cli submission-review \
  --bundle /path/to/bundle \
  --reruns 3 \
  --output-json /path/to/bundle/submission_review.json \
  --output-md /path/to/bundle/submission_review.md
```
Recommended for any external submission or reproducibility claim.
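One way a reviewer might programmatically confirm that the reruns agreed is sketched below. It assumes `submission_review.json` exposes a `reruns` list of per-rerun result objects; that structure is a hypothetical stand-in, not the documented format.

```python
import json

def reruns_deterministic(review):
    """Return True if every rerun produced an identical result payload.
    NOTE: assumes a hypothetical "reruns" list of result dicts; the real
    submission_review.json schema is defined by the tool."""
    serialized = [json.dumps(r, sort_keys=True) for r in review.get("reruns", [])]
    return len(set(serialized)) <= 1

stable = {"reruns": [{"gate": "pass"}, {"gate": "pass"}, {"gate": "pass"}]}
flaky = {"reruns": [{"gate": "pass"}, {"gate": "fail"}, {"gate": "pass"}]}
stable_ok = reruns_deterministic(stable)
flaky_ok = reruns_deterministic(flaky)
```

Canonical serialization with `sort_keys=True` avoids false mismatches from key ordering.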
## reference-vectors

```shell
python -m automechinterp_evaluator.cli reference-vectors
```
Use this to test behavioral compatibility when implementing third-party evaluators.
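A compatibility check with these vectors could look like the sketch below: run a third-party evaluator on a vector's input and compare its output against the expected result. The `input`/`expected` field names and the toy evaluator are assumptions for illustration only.

```python
import json

def check_vector(vector, evaluator):
    """Run a candidate evaluator on a vector's input and compare the output
    to the expected result. NOTE: the "input"/"expected" keys are a
    hypothetical vector layout, not the documented file format."""
    actual = evaluator(vector["input"])
    return json.dumps(actual, sort_keys=True) == json.dumps(vector["expected"], sort_keys=True)

# Toy third-party evaluator standing in for a real implementation:
vector = {"input": {"score": 0.9}, "expected": {"gate": "pass"}}
result_ok = check_vector(
    vector,
    lambda inp: {"gate": "pass" if inp["score"] >= 0.5 else "fail"},
)
```

Comparing canonically serialized JSON keeps the check insensitive to key order while still catching any value-level divergence.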
## Command discovery

For authoritative flag details:

```shell
python -m automechinterp_evaluator.cli --help
python -m automechinterp_evaluator.cli <command> --help
```