Steering interpretable language models with concept algebra

(guidelabs.ai)

33 points | by luulinh90s a day ago

3 comments