Happy Thanksgiving!
Happy Thanksgiving, everyone! I want to thank all of you readers for continuing to learn machine learning with me! To celebrate, I am offering a 40% discount on my two courses for the next 5 days:
Apply the coupon BLACKFRIDAY to any of your purchases! This includes lifetime access to all the upcoming live sessions!
Below are the details!
Machine Learning Fundamentals Bootcamp
Machine Learning System Design (1 week)
Machine Learning (ML) system design is crucial for building effective ML solutions. It involves structuring the entire ML project to align with specific goals, ensuring efficiency, scalability, and performance. Proper design is key for integrating data handling, model training, and deployment while addressing real-world complexities like data variability and scalability.
ML system design is possibly the most important skill for becoming a machine learning engineer. Being able to effectively architect a viable ML solution to bring value to customers is what makes the difference between success and failure! We are going to focus on the key aspects of managing and designing ML solutions.
Fundamentals of Machine Learning (2 weeks)
If you want to become an ML engineer, honing your skills in traditional machine learning remains critical! Most ML projects start with non-deep learning models. To this day, XGBoost remains one of the most used and best-performing models on tabular data, but what makes it special?
There are literally thousands of supervised learning algorithms, so instead of focusing on outdated or underperforming models, we are going to dive into the inner workings of models like XGBoost, LightGBM, and CatBoost to understand why they took the crown when it comes to statistical learning.
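To give a flavor of where we will start, here is a minimal sketch of training a gradient-boosted trees model with XGBoost's scikit-learn API; the synthetic dataset and hyperparameters are placeholders for illustration, not the course material.

```python
# Minimal gradient-boosted trees sketch with XGBoost (illustrative assumptions only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# synthetic tabular dataset standing in for a real problem
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# each boosting round fits a small tree to the gradients of the loss
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```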
Once we understand how the algorithms work, we are going to focus on how to automate the training of ML models. Every ML engineer should aim to automate the development, validation, and deployment of their ML models!
Deep Learning (2 weeks)
It is not possible nowadays to be an ML engineer without having an intimate understanding of Deep Learning techniques! Deep Learning revolutionized the fields of computer vision, natural language processing, recommender systems, and many others.
More than any other domain in ML, deep learning requires the mind of an architect. We are going to dissect the different foundational model units and how to use them to build large models, and we are going to learn how to architect the right loss function for the right learning problem.
We are going to dive into the specialized architectures and how we can use them for different applications.
Deep Reinforcement Learning (1 week)
For a long time, Reinforcement Learning was just an academic curiosity. Deep Learning completely changed that! It is now one of the fastest-growing ML domains, with applications such as autonomous driving and generative AI.
We are going to dive into the different strategies to train a model to make human-like decisions. We are going to focus on how Deep Reinforcement Learning can be used to fine-tune Large Language Models (LLMs).
Career Coaching (1 week)
Being good at interviewing is a very different skill from being good on the job! We are going to look at the best strategies to apply for jobs, design powerful resumes, and prepare for interviews.
What you are going to build in the projects
This bootcamp is hands-on, which means that it is meant to prepare you for the job. It is going to be more practical than theoretical, and we are going to implement the following in the projects:
Project 1: Design a Machine Learning System
Project 2: Implement XGBoost from Scratch!
Project 3: Kaggle competition using AutoML
Project 4: Implement ResNet50 from Scratch!
Project 5: Classification and Detection with Convolutional Neural Networks
Project 6: Training a Policy Gradient Model
Train, Fine-Tune, and Deploy Large Language Models Bootcamp
The Transformer Architecture (1 week)
The Transformer is the fundamental Neural Network architecture that enabled the evolution of Large Language Models as we know them today. We are going to break down its building blocks (a minimal sketch of the core attention operation follows the list below):
The Self-Attention Mechanism
The Multi-Head Attention
The Encoder-Decoder Architecture
The Position Embedding
The Layer Normalization
The Position-Wise Feed-Forward Network
The Cross-Attention Layer
The Language Modeling Head
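To make this concrete, here is a minimal, self-contained sketch of the scaled dot-product attention at the heart of the Transformer, written in PyTorch; the dimensions and random weights are placeholders for illustration only.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # project the inputs into queries, keys, and values
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # scaled dot-product similarity between every pair of positions
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)
    weights = F.softmax(scores, dim=-1)  # attention weights over the sequence
    return weights @ v                   # weighted sum of the values

# toy usage: batch of 2 sequences, 5 tokens each, hypothetical model width 16
d_model = 16
x = torch.randn(2, 5, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)  # shape (2, 5, 16)
```

The multi-head version simply runs several such projections in parallel and concatenates the results before a final linear layer.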
Training LLMs to Follow Instructions (1 week)
ChatGPT, Claude, and Gemini are LLMs trained to follow human instructions. We are going to learn how such models are trained from scratch (a minimal supervised fine-tuning sketch with Hugging Face follows the list):
The Causal Language Modeling Pretraining Step
The Supervised Learning Fine-Tuning Step
The Reinforcement Learning Fine-Tuning Step
Implementing These Steps with Hugging Face
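As a preview of the Hugging Face tooling we will use, here is a minimal sketch of the supervised fine-tuning step with the transformers Trainer; the model name and the toy instruction/response pairs are placeholder assumptions, not the course's actual setup.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model_name = "gpt2"  # hypothetical small model, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# toy instruction/response pairs formatted as plain text
texts = [
    "Instruction: say hi\nResponse: Hello!",
    "Instruction: add 2 + 2\nResponse: 4",
]
enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

# for causal language modeling, the labels are the input ids (padding masked out)
dataset = []
for ids, mask in zip(enc["input_ids"], enc["attention_mask"]):
    labels = ids.clone()
    labels[mask == 0] = -100  # ignore the loss on padding tokens
    dataset.append({"input_ids": ids, "attention_mask": mask, "labels": labels})

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-demo", num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=dataset,
)
trainer.train()
```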
How to Scale Model Training (1 week)
More than ever, we need efficient hardware to accelerate the training process. We are going to explore how to distribute the training computations across multiple GPUs using different parallelism strategies (a minimal data-parallel sketch with Accelerate follows the list):
CPU vs GPU vs TPU
The GPU Architecture
Distributed Training
Data Parallelism
Model Parallelism
Zero Redundancy Optimizer Strategy
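As a preview, here is a minimal data-parallel training loop sketch using the Hugging Face Accelerate package, which we also use in the projects; the tiny model and random data are placeholders for illustration.

```python
import torch
import torch.nn.functional as F
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator()  # detects the available GPUs / processes

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
loader = DataLoader(data, batch_size=8)

# prepare() wraps the model, optimizer, and dataloader so each process
# gets its shard of the data and gradients are synchronized across devices
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    accelerator.backward(loss)  # handles the gradient all-reduce for us
    optimizer.step()
```

Launched with `accelerate launch`, the same script runs unchanged on one GPU or many.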
How to Fine-Tune LLMs (1 week)
Fine-tuning a model means continuing its training on a specialized dataset for a specialized learning task. We are going to look at the different strategies to fine-tune LLMs (a minimal LoRA sketch with the PEFT package follows the list):
The Different Fine-Tuning Learning Tasks
Catastrophic Forgetting
LoRA Adapters
QLoRA
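As a taste of the tooling, here is a minimal sketch of attaching LoRA adapters to a model with the PEFT package; the base model name and the LoRA hyperparameters are illustrative assumptions.

```python
# Minimal LoRA sketch with PEFT (illustrative assumptions only).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # hypothetical small model

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # which layers receive adapters (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

Because only the adapter matrices are trained, the memory footprint drops dramatically; QLoRA pushes this further by quantizing the frozen base weights.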
How to Deploy LLMs (1 week)
The most important part of machine learning model development is deployment! A model that is not in production is costing money instead of generating money for the company. We are going to explore the different strategies to deploy LLMs (a minimal vLLM generation sketch follows the list):
The Deployment Strategies
Multi-LoRA
The Text Generation Layer
Streaming Applications
Continuous Batching
KV-Caching
PagedAttention and vLLM
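As a preview, here is a minimal generation sketch with vLLM, which implements continuous batching and PagedAttention under the hood; the model name and sampling settings are illustrative assumptions.

```python
# Minimal vLLM generation sketch (illustrative assumptions only).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # hypothetical small model for illustration
params = SamplingParams(temperature=0.8, max_tokens=64)

prompts = ["Explain KV-caching in one sentence.", "What is continuous batching?"]
outputs = llm.generate(prompts, params)  # requests are batched together automatically

for out in outputs:
    print(out.outputs[0].text)
```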
Building the Application Layer (1 week)
A deployed LLM on its own is not very useful. We are going to look at how we can build an agentic application on top of the model with LangChain (a minimal RAG sketch follows the list):
Implementing a Retrieval-Augmented Generation (RAG) pipeline with LangChain
Optimizing the RAG pipeline
Serving the pipeline with LangServe and FastAPI
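As a preview, here is a minimal RAG pipeline sketch using LangChain's expression language; the documents, model choices, and prompt are illustrative assumptions (an OpenAI API key is required), and the exact import paths may vary across LangChain versions.

```python
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# index a few toy documents in an in-memory vector store
docs = ["LoRA adds low-rank adapters to frozen weights.", "vLLM implements PagedAttention."]
retriever = FAISS.from_texts(docs, OpenAIEmbeddings()).as_retriever()

def format_docs(retrieved):
    return "\n".join(d.page_content for d in retrieved)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)

# retrieve -> build the prompt -> call the LLM -> parse the text output
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

print(rag_chain.invoke("What does LoRA do?"))
```

The same chain can then be exposed as an API endpoint with LangServe and FastAPI.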
What you are going to build in the projects
This bootcamp is hands-on, which means that it is meant to prepare you for the job. It is going to be more practical than theoretical, and we are going to implement the following in the projects:
Project 1: Implementing from scratch the sparse attention mechanisms, SwiGLU, RMSNorm, MoE, and RoPE embeddings in PyTorch
Project 2: Fine-tuning an LLM with PPO vs DPO vs ORPO using the PEFT package.
Project 3: Training an LLM in a distributed manner with the Accelerate package on AWS SageMaker, using the Zero Redundancy Optimizer strategy.
Project 4: Fine-tuning a model with QLoRA to increase the context size.
Project 5: Deploying a scalable LLM application API with streaming, KV-caching, Continuous batching, and text generation layer capabilities.
Project 6: Deploying a RAG application using LangChain, FastAPI, and LangServe.
Enjoy!