AI Research
Apr 26, 2026

SWE-bench Verified Discontinues Measurement of Advanced Coding Skills

AI Summary

SWE-bench Verified no longer evaluates frontier coding capabilities, because the benchmark cannot accurately assess complex programming skills. The decision reflects a shift toward more effective methods of evaluating software engineering proficiency.

  • SWE-bench Verified is no longer measuring advanced coding capabilities.
  • The decision was made because the benchmark could not accurately assess complex programming tasks.
  • The change indicates a shift towards finding better ways to evaluate software engineering skills.
swe-bench, coding capabilities, evaluation, software engineering, research