SWE-bench will hit 90% this year
(fabraix.com)
6 points | by
asfsf23423
4 hours ago
1 comments
upmind
1 hour ago
Maybe an unpopular opinion, but I think SWE-bench has done its part at this point and we need a new benchmark, because Gemini being at or near the same level as Claude is obviously wrong.
amazingamazing
1 hour ago
I use both and think they’re comparable. AMA.
lern_too_spel
44 minutes ago
Gemini at the same level as Claude is believable. Gemini CLI is not at the same level as Claude Code.