Katarzyna (Kasia) Kobalczyk
reinforcement learning
The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data
We train LLM agents as language-conditioned policies without requiring expensive labeled data or online experimentation. The framework uses LLMs to make unlabeled datasets usable for training and to improve generalization to unseen goals and states.
Thomas Pouplin, Kasia Kobalczyk, Hao Sun, Mihaela van der Schaar
May 1, 2025
openreview
arXiv
PDF
code