CLI Reference

This page summarizes the primary evaluator CLI commands and how they are typically used together.

Command map

| Command | Purpose | Typical output |
| --- | --- | --- |
| `init-template` | Scaffold a minimal valid bundle | bundle directory |
| `evaluate` | Compute machine-readable gate/tier outputs | `result.json` |
| `report` | Render a Markdown summary report | `stage_gate_report.md` |
| `submission-review` | Run deterministic reruns plus workflow guidance | `submission_review.json`, `submission_review.md` |
| `reference-vectors` | Emit compatibility vectors for cross-implementation checks | reference vector files |

evaluate

python -m automechinterp_evaluator.cli evaluate \
  --bundle /path/to/bundle \
  --output /path/to/bundle/result.json

Use this as the canonical machine-readable record for downstream tooling.
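Downstream tooling would typically parse `result.json` and branch on its contents. The exact schema is defined by the evaluator; the sketch below assumes a hypothetical top-level `"gate"` object with a boolean `"passed"` field purely for illustration, so adjust the keys to the real output.

```python
import json


def passed_gate(result_path):
    """Return True if the result file reports a passing gate.

    Assumes a hypothetical schema: {"gate": {"passed": bool, ...}, ...}.
    Missing keys are treated as a failing gate, which is the safe
    default for CI-style consumers.
    """
    with open(result_path) as f:
        result = json.load(f)
    return bool(result.get("gate", {}).get("passed", False))
```

A CI step could call this after `evaluate` and fail the build when it returns `False`.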

report

python -m automechinterp_evaluator.cli report \
  --bundle /path/to/bundle \
  --output /path/to/bundle/stage_gate_report.md

Use this for human review, writeups, and triage meetings.

submission-review

python -m automechinterp_evaluator.cli submission-review \
  --bundle /path/to/bundle \
  --reruns 3 \
  --output-json /path/to/bundle/submission_review.json \
  --output-md /path/to/bundle/submission_review.md

Recommended for any external submission or reproducibility claim.
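The point of deterministic reruns is that repeated evaluations of the same bundle should produce byte-identical machine-readable output. The command performs its own comparison; the sketch below only illustrates the underlying idea, using hypothetical rerun output paths.

```python
import hashlib


def outputs_identical(paths):
    """Return True if every file in `paths` has identical contents.

    Illustrates the determinism check behind repeated reruns by
    comparing SHA-256 digests byte-for-byte; the real
    submission-review command does its own comparison.
    """
    digests = set()
    for path in paths:
        with open(path, "rb") as f:
            digests.add(hashlib.sha256(f.read()).hexdigest())
    return len(digests) <= 1
```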

reference-vectors

python -m automechinterp_evaluator.cli reference-vectors

Use this to test behavioral compatibility when implementing third-party evaluators.
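A third-party evaluator can be checked by replaying each vector's input through its own implementation and comparing against the recorded expected output. The file format in this sketch is an assumption (a JSON array of objects with `"input"` and `"expected"` keys); consult the emitted vector files for the actual schema.

```python
import json


def check_vectors(vector_path, evaluate_fn):
    """Return the indices of vectors where evaluate_fn disagrees.

    Assumes a hypothetical format: a JSON array of
    {"input": ..., "expected": ...} objects. An empty return value
    means the implementation matched every vector.
    """
    with open(vector_path) as f:
        vectors = json.load(f)
    return [
        i for i, v in enumerate(vectors)
        if evaluate_fn(v["input"]) != v["expected"]
    ]
```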

Command discovery

For authoritative flag details:

python -m automechinterp_evaluator.cli --help
python -m automechinterp_evaluator.cli <command> --help
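When scripting several commands together (for example, `evaluate` followed by `report` and `submission-review`), it can help to build argv lists in one place rather than formatting shell strings. The helper below is a small sketch of that pattern; the subcommands and flags themselves come from `--help`, and the keyword-to-flag mapping is this sketch's own convention.

```python
def cli_command(subcommand, **flags):
    """Build an argv list for the evaluator CLI.

    Keyword arguments become long flags, with underscores translated
    to dashes (output_json -> --output-json). The resulting list can
    be passed to subprocess.run(). Flag names are not validated here;
    check them against the CLI's --help output.
    """
    argv = ["python", "-m", "automechinterp_evaluator.cli", subcommand]
    for name, value in flags.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv
```

For example, `cli_command("evaluate", bundle="/path/to/bundle", output="/path/to/bundle/result.json")` reproduces the `evaluate` invocation shown above.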