SWE-rebench V2: A 20-Language Benchmark for AI Coding Agents

Nebius has released SWE-rebench V2, a dataset of more than 32,000 real software engineering tasks across 20 programming languages. It is positioned as a replacement for the Python-only SWE-bench as the standard for training and evaluating AI coding agents.
software-engineering
artificial-intelligence
Author

Kabui, Charles

Published

2026-03-13

Keywords

swe-bench, coding-agents, reinforcement-learning, multilingual-benchmark, nebius