Computer Configuration Bench Test

Anthropic reports that agent coding performance varies by several percentage points depending on hardware configuration, and the difference in benchmark scores between high ...

Agent coding benchmarks such as SWE-bench and Terminal-Bench are widely used to compare the software engineering capabilities of state-of-the-art AI models. The top positions on these benchmark ...
