In Build a Large Language Model (from Scratch), you’ll discover how LLMs work from the inside out. In this insightful book, bestselling author Sebastian Raschka guides you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples. You’ll go from the initial design and creation to pretraining on a general corpus, all the way to finetuning for specific tasks.
Build a Large Language Model (from Scratch) teaches you how to:
Plan and code all the parts of an LLM
Prepare a dataset suitable for LLM training
Finetune LLMs for text classification and with your own data
Apply instruction tuning techniques to ensure your LLM follows instructions
Load pretrained weights into an LLM
The large language models (LLMs) that power cutting-edge AI tools like ChatGPT, Bard, and Copilot seem like a miracle, but they’re not magic. This book demystifies LLMs by helping you build your own from scratch. You’ll get a unique and valuable insight into how LLMs work, learn how to evaluate their quality, and pick up concrete techniques to finetune and improve them.
The process you use to train and develop your own small-but-functional model in this book follows the same steps used to deliver huge-scale foundation models like GPT-4. Your small-scale LLM can be developed on an ordinary laptop, and you’ll be able to use it as your own personal assistant.
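About the author
Sebastian Raschka has worked in machine learning and artificial intelligence for more than a decade. He joined Lightning AI in 2022, where he now focuses on AI and LLM research, developing open-source software, and creating educational material. Before that, Sebastian was an assistant professor in the Department of Statistics at the University of Wisconsin-Madison, focusing on deep learning and machine learning research. He is passionate about education and is known for his work on machine learning with open-source software.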