1 point | by WASDAai a day ago
1 comment
TL;DR: The paper shows how LLMs can be steered by manipulating prompts, hidden states, or model weights, and warns that the same techniques can be used maliciously.
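For readers unfamiliar with the middle option, below is a minimal sketch of hidden-state ("activation") steering, one of the three intervention levels the TL;DR mentions. It assumes a GPT-2-style Hugging Face model and uses a toy difference-of-means steering vector; the layer index, strength, and contrast prompts are arbitrary illustrations, not the paper's method.

```python
# Hidden-state steering sketch: add a direction to one layer's activations at inference time.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"          # assumption: any small causal LM with a GPT-2 layout
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

LAYER = 6    # which transformer block to steer (hypothetical choice)
ALPHA = 4.0  # steering strength (hypothetical)

def hidden_at_layer(text: str) -> torch.Tensor:
    """Mean hidden state of `text` at LAYER, used to build a toy steering vector."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0].mean(dim=0)

# Toy steering direction: positive-sentiment prompt minus negative-sentiment prompt.
steer = hidden_at_layer("I love this, it is wonderful") - \
        hidden_at_layer("I hate this, it is terrible")

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + ALPHA * steer
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
ids = tok("The movie was", return_tensors="pt")
with torch.no_grad():
    out_ids = model.generate(**ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out_ids[0]))
handle.remove()  # remove the hook to restore normal behavior
```

The same hook-based pattern is what makes the dual-use warning concrete: anyone with access to the forward pass can nudge the model's behavior without touching the prompt or the weights.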