Agent-evals: Metacognitive scoring and boundary testing for LLM coding agents

(thinkwright.ai)

2 points | by oceanwaves 13 hours ago

1 comment

13 hours ago
[deleted]