Reinforcement learning from human feedback, alignment, and post-training LLMs
Nathan Lambert
Publisher: Manning
Published: 2026
Duration: 6 hr 15 min
ISBN: 9781633434301
This is a guide to reinforcement learning from human feedback (RLHF), alignment, and post-training for large language models (LLMs). RLHF is the process of using human judgments of a model's outputs to shape its alignment and behavior. Author Nathan Lambert blends perspectives from fields such as philosophy and economics with the core mathematics and computer science of RLHF to provide a practical guide for applying RLHF to models. The book explores the ideas, established techniques, and best practices of the field, beginning with an overview of leading papers before detailing training and optimization tools.

Content Overview

- Foundations: How advanced AI models are taught from human feedback and how large-scale preference data is collected.
- Core Methods: Derivations and implementations of the policy-gradient methods used to train AI models with reinforcement learning (RL).
- Optimization: Direct Preference Optimization (DPO), direct alignment algorithms, and methods for preference finetuning.
- Modern Developments: The transition to reinforcement learning from verifiable rewards (RLVR) and industry techniques for character, personality, and AI feedback training.
- Evaluation: How to approach evaluation and how the field's standards have changed over time.
- Implementation: Standard post-training recipes combining instruction tuning with RLHF, and the development history of open models such as Llama-Instruct, Zephyr, Olmo, and Tülu.

Advanced Techniques

The book details optimization tools such as reward models, regularization, and instruction tuning, alongside advanced techniques including constitutional AI, synthetic data, and current open questions in the field. It examines modern RLHF training pipelines and their trade-offs through hands-on experiments and mini-implementations.

About the Reader

This book is intended for engineers and AI scientists getting started with AI training, and for students seeking a foothold in the industry.
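Of the optimization methods listed in the content overview, Direct Preference Optimization has a particularly compact objective, which gives a flavor of the material the book covers. Below is a minimal, illustrative sketch (not code from the book) of the DPO loss for a single preference pair; it assumes the summed log-probabilities of the chosen and rejected responses are already computed, and the function and argument names are hypothetical.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Illustrative DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy (pi_*) or the frozen
    reference model (ref_*). beta scales the implicit rewards and
    controls how far the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response than the rejected one, relative to the reference.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), written stably as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))

# When the policy matches the reference, the margin is zero and the
# loss is log(2); widening the margin in favor of the chosen response
# drives the loss toward zero.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

In a real training loop this scalar form would be vectorized over a batch, but the core idea is the same: the loss rewards the policy for assigning a larger log-probability margin to preferred responses than the reference model does.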
About the Author

Nathan Lambert is the post-training lead at the Allen Institute for AI, having previously worked at Hugging Face, DeepMind, and Facebook AI. He has guest lectured at Stanford, Harvard, and MIT, and presents at NeurIPS and other AI conferences. He has received the Best Theme Paper Award at ACL and a Geekwire Innovation of the Year award. Nathan earned a PhD in Electrical Engineering and Computer Science from the University of California, Berkeley.