3 points | by Anon84 3 days ago
1 comment
Interesting twist on automated curriculum learning. This paper uses an LLM as both the environment and the policy, whereas other papers use LLMs for the policy/value function. It would be cool to see other reward strategies that tie all these threads together.