Show HN: New eval from SWE-bench team evaluates LMs based on goals, not tickets

(codeclash.ai)

3 points | by lieret 8 hours ago

1 comment