# Leaderboard
Official benchmark results. Scores are updated when participants run the CLI evaluator and submit JSON outputs.
Overall = 0.50 × Documentation + 0.50 × Understanding.
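The weighting above can be sketched as a one-line function; a minimal illustration, assuming per-track scores are reported as percentages (the function name is ours, not part of the evaluator's API):

```python
def overall_score(documentation: float, understanding: float) -> float:
    """Overall = 0.50 * Documentation + 0.50 * Understanding."""
    return 0.5 * documentation + 0.5 * understanding

print(overall_score(80.0, 60.0))  # → 70.0
```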
## Current Results
If no results appear, ensure `results/leaderboard.json` exists or re-run `legacycodebench evaluate`.
| Rank | Model | Version | Overall (%) | Documentation (%) | Understanding (%) | Tasks Solved | Date | Paper |
|---|---|---|---|---|---|---|---|---|
## Submission Workflow
- Run models via the CLI (`legacycodebench run-ai --model gpt-4o`).
- Generate JSON results (`legacycodebench evaluate`).
- Open a pull request adding your result JSON to the repository.
- Maintainers review and update this leaderboard.
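Before opening the pull request, it can help to sanity-check the result JSON locally. A hedged sketch: the field names below are assumptions for illustration, not the evaluator's actual output schema.

```python
import json

# Hypothetical required fields -- adjust to the evaluator's real schema.
REQUIRED = {"model", "documentation", "understanding", "overall"}

def check_result(path: str) -> dict:
    """Load a result JSON and verify it has the expected fields and
    that Overall matches the 0.50/0.50 weighting."""
    with open(path) as f:
        result = json.load(f)
    missing = REQUIRED - result.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    expected = 0.5 * result["documentation"] + 0.5 * result["understanding"]
    if abs(result["overall"] - expected) > 1e-6:
        raise ValueError("overall does not match 0.50/0.50 weighting")
    return result
```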