Documentation
This site is static. All execution happens via the CLI, similar to SWE-bench. Use the commands below to run LegacyCodeBench locally.
Installation
git clone https://github.com/kalmantic/legacycodebench.git
cd legacycodebench
pip install -e .
legacycodebench --help
Requires Python 3.10+, Git, and Docker (for verification).
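The requirements above can be checked before installing. The following is a minimal sketch, not part of LegacyCodeBench: the function name `missing_prerequisites` is hypothetical, and it only tests that `git` and `docker` are on the PATH, not that the Docker daemon is running.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)                # from "Requires Python 3.10+"
REQUIRED_TOOLS = ("git", "docker")  # Docker is needed for verification


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper: the arguments exist only so the checks can be
    exercised without touching the real environment.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


# Report anything missing in the current environment.
for problem in missing_prerequisites():
    print(f"missing: {problem}")
```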
Core CLI Commands
Load Datasets
legacycodebench load-datasets
Create Tasks
legacycodebench create-tasks
Run a Model
legacycodebench run-ai --model gpt-4o
Full Evaluation
legacycodebench evaluate
Interactive Mode
legacycodebench interactive
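The non-interactive commands above can be chained from a script. This sketch assumes they are meant to run in the order listed and that each exits with a non-zero status on failure; neither is confirmed here, and `evaluate` may already cover some earlier steps. The names `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess


def pipeline_commands(model="gpt-4o"):
    """Build the argument lists for the documented CLI steps, in listed order."""
    return [
        ["legacycodebench", "load-datasets"],
        ["legacycodebench", "create-tasks"],
        ["legacycodebench", "run-ai", "--model", model],
        ["legacycodebench", "evaluate"],
    ]


def run_pipeline(model="gpt-4o", runner=subprocess.run):
    """Run each step in turn, stopping at the first failure.

    With the default runner, check=True raises CalledProcessError
    as soon as a step exits with a non-zero status.
    """
    for cmd in pipeline_commands(model):
        print("$", " ".join(cmd))
        runner(cmd, check=True)
```

Calling `run_pipeline("gpt-4o")` in an environment where the CLI is installed would execute the four steps back to back.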
Reference Documentation Workflow
Reference docs are created by COBOL experts using CLI templates:
python scripts/create_reference_template.py \
--task-id LCB-DOC-001 \
--expert-id expert1
Experts complete the template, inter-annotator agreement is measured, and merged references are stored under references/.
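The agreement metric is not specified here. As an illustration only, the sketch below computes Cohen's kappa, a common agreement measure for two annotators assigning categorical labels. The metric choice, the function name, and the sample labels are all assumptions and may differ from what LegacyCodeBench does.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments from two experts on four template items.
expert1 = ["yes", "yes", "no", "no"]
expert2 = ["yes", "no", "no", "no"]
print(cohens_kappa(expert1, expert2))  # 0.5
```

In this example the experts agree on 3 of 4 items (0.75 observed) against 0.5 expected by chance, giving a kappa of 0.5.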
Troubleshooting
OpenAI quota or 5xx errors?
Wait a few minutes and retry, or switch models with legacycodebench run-ai --model claude-sonnet-4. The CLI falls back to mock responses when APIs are unavailable.
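The wait-then-switch advice can be sketched as a small wrapper. It assumes the CLI reports API failures through a non-zero exit status, which may not hold given the fallback to mock responses. The function name `run_with_fallback` and the 180-second wait are hypothetical.

```python
import subprocess
import time


def run_with_fallback(models=("gpt-4o", "claude-sonnet-4"), wait_seconds=180,
                      runner=subprocess.run, sleep=time.sleep):
    """Try each model in turn; return the first whose run exits with status 0.

    Returns None if every attempt fails.
    """
    for index, model in enumerate(models):
        result = runner(["legacycodebench", "run-ai", "--model", model])
        if result.returncode == 0:
            return model
        if index < len(models) - 1:
            sleep(wait_seconds)  # pause before trying the next model
    return None
```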
Leaderboard not printing?
Run legacycodebench leaderboard --print. If nothing appears, check that results/leaderboard.json exists; the file is generated by legacycodebench evaluate.
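A quick way to tell a missing leaderboard file from a corrupt one is sketched below. The file's schema is not documented here, so the check only confirms that the file exists and parses as JSON. The function name `leaderboard_status` is hypothetical.

```python
import json
from pathlib import Path


def leaderboard_status(path="results/leaderboard.json"):
    """Return "ok", or a short message describing what is wrong with the file."""
    file = Path(path)
    if not file.is_file():
        return "missing: run `legacycodebench evaluate` first"
    try:
        json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc.msg}"
    return "ok"


print(leaderboard_status())
```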
Need new tasks?
Edit legacycodebench/task_generator.py, then rerun legacycodebench create-tasks. Intelligent selection ensures PRD-aligned coverage.