Leaderboard

Official benchmark results. Scores are updated when participants run the CLI evaluator and submit their JSON outputs via pull request.

Overall = 0.50 × Documentation + 0.50 × Understanding.
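The weighting above can be sketched in Python. The function name and the example scores are illustrative only; the equal 0.50 weights mirror the formula.

```python
def overall_score(documentation: float, understanding: float) -> float:
    """Combine the two benchmark dimensions with equal weight.

    Overall = 0.50 * Documentation + 0.50 * Understanding
    """
    return 0.50 * documentation + 0.50 * understanding

# Illustrative values, not real leaderboard scores.
print(round(overall_score(0.80, 0.60), 2))  # → 0.7
```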

Current Results

The results table is generated from the submitted result JSON files.

Submission Workflow

  1. Run your model via the CLI (legacycodebench run-ai --model gpt-4o).
  2. Generate JSON results (legacycodebench evaluate).
  3. Open a pull request adding your result JSON file to the repository.
  4. Maintainers review the submission and update this leaderboard.
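The pull request in step 3 adds a result JSON produced by the evaluator. A sketch of what such a file might contain is below; every field name here is a hypothetical placeholder, since the actual schema is whatever legacycodebench evaluate emits.

```json
{
  "model": "gpt-4o",
  "documentation": 0.80,
  "understanding": 0.60,
  "overall": 0.70
}
```

The "overall" value here is just the 0.50/0.50 weighting of the two example scores, matching the formula stated above.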