LLMs MasterClass: Last Day for Early-Bird Price
Train, Fine-Tune and Deploy Large Language Models
Today is the last day to get early-bird pricing (25% off) for the Train, Fine-Tune, and Deploy Large Language Models Masterclass! On top of that, members of the AiEdge Newsletter get an extra 20% discount by applying the coupon NEWSLETTER. Make sure to register before it is too late:
You can get an additional 20% off if you enroll in the ML Fundamentals Bootcamp as well:
The Bootcamp starts on August 15th, 2024: six weeks of intense, hands-on learning to become an expert in building LLM applications.
What you are going to build in the projects (may be subject to change)
This bootcamp is hands-on, which means it is meant to prepare you for the job. It is going to be more practical than theoretical, and we are going to implement the following in the projects:
Project 1: Implementing from scratch the sparse attention mechanisms, SwiGLU, RMSNorm, MoE, and RoPE embeddings in PyTorch (a minimal RMSNorm sketch follows the project list below).
Project 2: Fine-tuning an LLM with PPO vs DPO vs ORPO using the PEFT package.
Project 3: Training an LLM in a distributed manner with the Accelerate package on AWS SageMaker, using the Zero Redundancy Optimizer strategy.
Project 4: Fine-tuning a model with QLoRA to increase the context size.
Project 5: Deploying a scalable LLM application API with streaming, KV-caching, continuous batching, and text generation layer capabilities.
Project 6: Deploying a RAG application using LangChain, FastAPI, and LangServe.
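To give you a taste of what "from scratch" means in Project 1, here is a minimal RMSNorm sketch in PyTorch. It is only an illustration of the style of the project work, not the exact code we will write, and the tensor sizes are placeholders:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescale activations by their RMS, with no mean-centering."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learnable per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # RMS over the feature dimension, with eps for numerical stability.
        rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x / rms

x = torch.randn(2, 5, 16)        # (batch, sequence, features)
print(RMSNorm(16)(x).shape)      # torch.Size([2, 5, 16])
```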
Who is this Bootcamp for?
This Bootcamp is meant for Engineers with some experience in Data Science or Machine Learning Engineering who want to upgrade their skills in Large Language Modeling. This Bootcamp is not meant to be easy, but I can promise you that your understanding of LLMs will be at a completely different level!
Prerequisites
Prior experience or knowledge of Machine Learning - at least 6 months. I expect participants to feel comfortable with the concepts covered in the Machine Learning Fundamentals Bootcamp.
Proficiency in Python - at least 1 year of experience.
What is included!
40+ hours of recorded lectures
6 hands-on projects
Project support
Certification upon graduation
Access to our online community
Lifetime access to course content
The Schedule
We are going to meet every Thursday and Friday between 9 am and 12 pm PST starting August 15th.
6 Weeks of Intense Learning!
The Transformer Architecture (1 week)
The Transformer is the fundamental neural network architecture that enabled the evolution of Large Language Models as we know them today (a minimal self-attention sketch follows the topic list below).
The self-attention mechanism
The multi-head attention
The encoder-decoder architecture
The position embedding
The layer normalization
The position-wise feed-forward network
The cross-attention layer
The language modeling head
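As a preview of the self-attention mechanism, here is a minimal single-head scaled dot-product attention sketch in PyTorch. The projection matrices and tensor sizes are placeholders; the course builds the full multi-head version:

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over x of shape (batch, seq, dim)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (batch, seq, seq) similarity scores
    weights = F.softmax(scores, dim=-1)                        # attention weights sum to 1 per query
    return weights @ v                                         # weighted sum of the value vectors

dim = 16
x = torch.randn(2, 5, dim)
w_q, w_k, w_v = (torch.randn(dim, dim) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([2, 5, 16])
```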
Training LLMs to Follow Instructions (1 week)
ChatGPT, Claude, and Gemini are LLMs trained to follow human instructions. We are going to learn how such models are trained from scratch (a minimal causal language modeling sketch follows the list below):
The Causal Language Modeling Pretraining Step
The Supervised Learning Fine-Tuning Step
The Reinforcement Learning Fine-Tuning Step
Implementing those Steps with HuggingFace
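For the causal language modeling pretraining step, here is a minimal sketch with the HuggingFace Trainer. The "gpt2" checkpoint and the two-sentence corpus are placeholders; a real run uses a large dataset and the exact training setup may differ in the course:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # small placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tiny toy corpus; in practice this is a large pretraining or instruction dataset.
texts = ["The transformer architecture powers modern LLMs.",
         "Instruction tuning aligns models with user intent."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=64), batched=True
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clm-demo", per_device_train_batch_size=2, num_train_epochs=1),
    train_dataset=dataset,
    # mlm=False -> causal LM: the labels are the input tokens shifted by one position.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```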
How to Scale Model Training (1 week)
More than ever, we need efficient hardware to accelerate the training process. We are going to explore how to distribute the training computations across multiple GPUs using different parallelism strategies (a minimal Accelerate sketch follows the list below):
CPU vs GPU vs TPU
The GPU Architecture
Distributed Training
Data Parallelism
Model Parallelism
Zero Redundancy Optimizer Strategy
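Here is a minimal sketch of a training loop with the Accelerate package. The model, data, and hyperparameters are placeholders; the ZeRO sharding itself is selected during the `accelerate config` step (e.g. through the DeepSpeed plugin), and the loop below stays the same regardless of the parallelism strategy:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(128, 2)                    # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataloader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(256, 128), torch.randint(0, 2, (256,))),
    batch_size=32,
)

# prepare() moves everything to the right devices and wraps it for distributed execution.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, labels in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)  # replaces loss.backward() so gradients sync across processes
    optimizer.step()
```

The script is launched on multiple GPUs with `accelerate launch train.py` after running `accelerate config`.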
How to Fine-Tune LLMs (1 week)
Fine-tuning a model means we continue its training on a specialized dataset for a specific learning task. We are going to look at the different strategies to fine-tune LLMs (a minimal LoRA/QLoRA sketch follows the list below):
The different fine-tuning learning tasks
Catastrophic forgetting
LoRA Adapters
QLoRA
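Here is a minimal QLoRA-style sketch with the PEFT package: load a frozen base model in 4-bit and attach small trainable LoRA adapters. The "facebook/opt-125m" checkpoint and the hyperparameters are placeholders, and running it assumes a GPU with bitsandbytes installed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# QLoRA = quantize the frozen base model to 4-bit, then train LoRA adapters on top of it.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m", quantization_config=bnb_config)

lora_config = LoraConfig(
    r=8,                          # rank of the low-rank update matrices
    lora_alpha=16,                # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT; names differ per architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights are trainable
```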
How to Deploy LLMs (1 week)
The most important part of machine learning model development is the deployment! A model that is not in production is a model that costs money instead of generating money for the company. We are going to explore the different strategies to deploy LLMs (a minimal vLLM sketch follows the list below):
The Deployment Strategies
Multi-LoRA
The Text Generation Layer
Streaming Applications
Continuous Batching
KV-Caching
PagedAttention and vLLM
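As a preview of serving, here is a minimal offline-inference sketch with vLLM, which handles paged attention, KV-caching, and continuous batching internally. The checkpoint name and sampling parameters are placeholders:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")               # placeholder checkpoint
params = SamplingParams(temperature=0.8, max_tokens=64)

# Prompts submitted together are batched continuously on the GPU.
outputs = llm.generate(["What is KV-caching?", "Explain continuous batching."], params)
for output in outputs:
    print(output.outputs[0].text)
```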
Building the Application Layer (1 week)
A deployed LLM on its own is not that useful. We are going to look at how to build an agentic application on top of the model with LangChain (a minimal RAG sketch follows the list below):
Implementing a Retrieval-Augmented Generation (RAG) pipeline with LangChain
Optimizing the RAG pipeline
Serving the pipeline with LangServe and FastAPI
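Here is a minimal RAG sketch with LangChain's expression language. The two documents, the OpenAI models, and the prompt are placeholders (it assumes an OPENAI_API_KEY and recent langchain packages); the course version adds a real vector database, optimizations, and LangServe/FastAPI serving:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Build a tiny in-memory vector store (the documents are placeholders).
vectorstore = FAISS.from_texts(
    ["LLMs are trained on large text corpora.", "RAG grounds answers in retrieved context."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

def format_docs(docs):
    # Concatenate retrieved documents into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)

# Pipeline: retrieve -> format the prompt -> call the model -> parse to a string.
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

print(chain.invoke("What does RAG do?"))
```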
Don’t hesitate to send me an email if you have more questions: damienb@theaiedge.io.