Agent-evals: Metacognitive scoring and boundary testing for LLM coding agents
(thinkwright.ai)
2 points | by oceanwaves | 13 hours ago | 1 comment
13 hours ago
[deleted]