Blanket
Reinforcement Learning From Human Feedback
Training Language Models to Follow Instructions With Human Feedback (2022)
Deep Reinforcement Learning from Human Preferences (2017)