Benchmark Six Sigma Forum

Code Generation & Programming Task Leaderboards

These leaderboards rank models by how well they solve programming problems, complete code snippets, or write functions from docstrings. They are a standard reference for comparing code models such as Code Llama, StarCoder, and GPT-4 with Code Interpreter. Common datasets include HumanEval, MBPP, and CodeContests.
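To make the "write a function from a docstring" setup concrete, here is a minimal sketch of how a single task might be checked. The task, the candidate completion, the tests, and the `passes` helper are all hypothetical and are not taken from HumanEval, MBPP, or any leaderboard's harness.

```python
# Hypothetical task in the style of docstring-based benchmarks.
PROMPT = '''def running_max(xs):
    """Return a list where element i is the maximum of xs[:i+1]."""
'''

# A candidate completion, standing in for model output.
COMPLETION = '''    out, best = [], None
    for x in xs:
        best = x if best is None or x > best else best
        out.append(best)
    return out
'''

# Hidden tests as (args, expected) pairs; assumed for illustration.
TESTS = [(([1, 3, 2, 5, 4],), [1, 3, 3, 5, 5]), (([],), [])]


def passes(prompt, completion, tests, entry_point):
    """Return True if prompt + completion defines a function passing all tests."""
    namespace = {}
    try:
        exec(prompt + completion, namespace)  # unsafe for untrusted code
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in tests)
    except Exception:
        return False


print(passes(PROMPT, COMPLETION, TESTS, "running_max"))
```

Real evaluation harnesses run generated code in a sandbox with time limits; calling `exec` directly, as above, is only reasonable for code you trust.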

Tools:

  • BigCode Leaderboard – Benchmarks open-source code models on multiple coding challenges and reports pass@k metrics.

  • EvalPlus Leaderboard – Ranks models on the correctness of generated code using HumanEval+ and MBPP+, which extend the original HumanEval and MBPP problems with many additional test cases.
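For readers unfamiliar with pass@k: it estimates the probability that at least one of k sampled solutions passes all tests. Below is a sketch of the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass. The function name and argument checks are my own, and whether a given leaderboard computes the metric exactly this way is an assumption.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c passed all tests."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a passing one.
        return 1.0
    # One minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples for one problem, 3 of them correct.
for k in (1, 5, 10):
    print(k, pass_at_k(10, 3, k))
```

A benchmark score is then the mean of this value over all problems. With k = 1 the estimate reduces to c / n, the fraction of samples that pass.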
