Leveling Up AI: Reinforcement Learning with Human Feedback (Ep. 222)

April 4, 2023

In this episode, we dive into the not-so-secret sauce of ChatGPT and what sets it apart from its predecessors in NLP and large language models.

We explore how human feedback can speed up learning in reinforcement learning, making training more efficient and effective.

Whether you’re a machine learning practitioner, researcher, or simply curious about how machines learn, this episode will give you a fascinating glimpse into the world of reinforcement learning with human feedback.

References

Learning through human feedback

https://www.deepmind.com/blog/learning-through-human-feedback

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2204.05862
