30 Free Machine Learning E-Books!
Today, I have aggregated 30 e-books about Machine Learning that can be found online. Enjoy!
Deep Learning
Ian Goodfellow, Yoshua Bengio, and Aaron Courville
The Deep Learning textbook is a resource intended to help students and practitioners enter the field of general machine learning and deep learning. The online version of the book is now complete and will remain available online for free.
Mathematics for Machine Learning
Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong
The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability, and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students or professionals to learn mathematics efficiently. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites.
An Introduction to Statistical Learning
Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor
An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. This book is appropriate for anyone who wishes to use contemporary tools for data analysis.
The Elements of Statistical Learning
Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
This book describes the important ideas in a variety of fields, such as medicine, biology, finance, and marketing, in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with liberal use of color graphics.
Probabilistic Machine Learning: An Introduction
Kevin Patrick Murphy
This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory.
Probabilistic Machine Learning: Advanced Topics
Kevin Patrick Murphy
An advanced counterpart to Probabilistic Machine Learning: An Introduction, this high-level textbook provides researchers and graduate students detailed coverage of cutting-edge topics in machine learning, including deep generative modeling, graphical models, Bayesian inference, reinforcement learning, and causality.
Understanding Machine Learning: From Theory to Algorithms
Shai Shalev-Shwartz and Shai Ben-David
The aim of this textbook is to introduce machine learning and the algorithmic paradigms it offers in a principled way. The book provides an extensive theoretical account of the fundamental ideas underlying machine learning and the mathematical derivations that transform these principles into practical algorithms.
Automated Machine Learning: Methods, Systems, Challenges
Frank Hutter • Lars Kotthoff • Joaquin Vanschoren
This open-access book presents the first comprehensive overview of general methods in Automated Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first series of international challenges of AutoML systems.
Applied Causal Inference
Uday Kamath, Kenneth Graham, Mitchell Naylor
This book is designed to help anyone along the spectrum of experience with causal inference – nearly everyone, from absolute beginners to experienced users of causality, will be able to learn something about the world of causal inference and how it can be applied.
Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto
In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the field's key ideas and algorithms. This second edition has been significantly expanded and updated, presenting new topics and updating coverage of other topics.
The Hundred-Page Machine Learning Book
Andriy Burkov
Burkov has undertaken a very useful but impossibly hard task in reducing all of machine learning to 100 pages. He succeeds well in choosing the topics — both theory and practice — that will be useful to practitioners, and for the reader who understands that this is the first 100 (or actually 150) pages you will read, not the last, provides a solid introduction to the field.
Machine Learning Engineering
Andriy Burkov
This new book by Andriy Burkov is the most complete applied AI book out there. It is filled with best practices and design patterns for building reliable machine learning solutions that scale.
Natural Language Processing with Python
Steven Bird, Ewan Klein, and Edward Loper
This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation.
Dive into Deep Learning
Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola
This book is a comprehensive resource that makes deep learning approachable while still providing sufficient technical depth to enable engineers, scientists, and students to use deep learning in their own work.
Machine Learning Yearning
Andrew Ng
Andrew Ng originally released the book in 13 parts; the complete book, with all the parts consolidated, is also available. In this book, you will learn how to align on ML strategies in a team setting, as well as how to set up development (dev) sets and test sets.
Machine Learning for Humans
Vishal Maini, Samer Sabri
This guide is intended to be accessible to anyone. Basic concepts in probability, statistics, programming, linear algebra, and calculus will be discussed, but it isn’t necessary to have prior knowledge of them to gain value from this series.
Pattern Recognition and Machine Learning
Christopher M. Bishop
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions when no other books apply graphical models to machine learning.
Deep Learning on Graphs
Yao Ma and Jiliang Tang
This book covers comprehensive content on developing deep learning techniques for graph-structured data with a specific focus on Graph Neural Networks (GNNs). The foundation of the GNN models is introduced in detail, including the two main building operations: graph filtering and pooling operations.
Approaching (Almost) Any Machine Learning Problem
Abhishek Thakur
This book is for people who have some theoretical knowledge of machine learning and deep learning and want to dive into applied machine learning. The book doesn't explain the algorithms but is more oriented towards how and what you should use to solve machine learning and deep learning problems.
Feature Engineering and Selection: A Practical Approach for Predictive Models
Max Kuhn and Kjell Johnson
This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance.
Hands-On Machine Learning with R
Bradley Boehmke & Brandon Greenwell
Hands-on Machine Learning with R provides a practical and applied approach to learning and developing intuition into today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R.
Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI
Shlomo Kashani and Amir Ivry
It is designed to both rehearse interview or exam-specific topics and provide machine learning M.Sc./Ph.D. students and those awaiting an interview with a well-organized overview of the field.
An Introduction to Machine Learning Interpretability
Patrick Hall and Navdeep Gill
To help practitioners make the most of recent and disruptive breakthroughs in debugging, explainability, fairness, and interpretability techniques for machine learning, this report defines key terms, introduces the human and commercial motivations for the techniques, and discusses predictive modeling and machine learning from an applied perspective, focusing on the common challenges of business adoption, internal model documentation, governance, validation requirements, and external regulatory mandates.
Interpretable Machine Learning: A Guide For Making Black Box Models Explainable
Christoph Molnar
This book covers a range of interpretability methods, from inherently interpretable models to methods that can make any model interpretable, such as SHAP, LIME and permutation feature importance. It also includes interpretation methods specific to deep neural networks and discusses why interpretability is important in machine learning.
Boosting: Foundations and Algorithms
Robert E. Schapire, Yoav Freund
An accessible introduction and essential reference for an approach to machine learning that creates highly accurate prediction rules by combining many weak and inaccurate ones.
A Brief Introduction to Machine Learning for Engineers
Osvaldo Simeone
This presents the problem of where the engineer should start. The answer is often "for a general, but slightly outdated introduction, read this book; for a detailed survey of methods based on probabilistic models, check this reference; to learn about statistical learning, this text is useful" and so on.
Speech and Language Processing
Daniel Jurafsky & James Martin
For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing.
Computer Vision: Models, Learning, and Inference
Simon J.D. Prince
This modern treatment of computer vision focuses on learning and inference in probabilistic models as a unifying theme. It shows how to use training data to learn the relationships between the observed image data and the aspects of the world that we wish to estimate, such as the 3D structure or the object class, and how to exploit these relationships to make new inferences about the world from new image data.
Information Theory, Inference and Learning Algorithms
David J. C. MacKay
This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction.
Machine Learning For Dummies
Judith Hurwitz and Daniel Kirsch
Machine Learning For Dummies, IBM Limited Edition, gives you insights into what machine learning is all about and how it can impact the way you can weaponize data to gain unimaginable insights. Your data is only as good as what you do with it and how you manage it.
While some might argue (myself included) that a guide to machine learning containing 30 e-books sounds a wee bit excessive, it's still an intriguing resource for anyone looking to delve into the depths of this ever-expanding domain. After all, who can resist the temptation of hatching neural networks and unraveling the mysteries of data prediction? It's impossible to ignore the intriguing possibilities brought about by the power of machine learning. Cheers to knowledge enhancement.
Provocative question! Having spent at least 25 years studying RL, ever since my first real job at IBM Research, where I explored the use of methods like Q-learning from 1990–93 to teach robots new tasks, I’ve watched the field go through its various phases. In the early 1990s, when I got involved, it was restricted to a small handful of aficionados. In 1995, I organized the first National Science Foundation workshop on RL, to which about 50–60 senior researchers were invited.
Gradually, through the early part of the 2000s, the field gained popularity, but never seemed to become a mainstream research topic within ML. Then, wham! DeepMind did its thingie with the combination of deep learning and RL, applied to the visually appealing domain of Atari video games, and (deep) RL’s popularity went through the roof. Now, it seems all the rage, and certainly, many employers are hiring (in the Bay Area, it’s a skill sought after by some of the labs doing autonomous driving). Google paid half a billion euros for DeepMind (supposedly!), on the basis of their deep RL Atari demo. So, this looked like a real turning point, and RL came to life!
So, getting back to the question, is RL a “dead end”? In answering this provocative question, one has to clarify one’s point of view. From the standpoint of the work going on at DeepMind and other places on using deep RL to play games like Go or chess, or to train a self-driving car given an accurate simulator of the world, RL is certainly poised to become well-established technology. RL sessions at major AI and ML conferences are very well attended, and RL submissions are definitely increasing. In all these dimensions, RL is very much not at a “dead end”; in fact, its popularity is only increasing.
But, but, … you knew there was a but coming there!
When you impose on RL the goal of “online learning in real time from the real world”, and not doing millions of simulation steps where agents can be killed thousands of times with no penalty, I fear RL is very much at a dead end. It is not clear to me that any extension of the au courant deep RL methods is going to lead to successes in the real world, in terms of a physical agent that can learn in real time with a small number of examples.
That is, if your goal is to build a model of how humans learn complex skills, such as driving, then RL to me is a very poor explanation of how such skills are acquired. One has only to look at the comparative results reported in the AAAI 2017 paper by Tsividis et al., comparing random humans on Amazon Mechanical Turk with the best deep RL programs at Atari video games, to see where deep RL simply flounders. Humans learn Atari video games, like Frostbite, about 1000x faster than the fastest deep RL methods.
A typical human learned Frostbite in 1 minute with a few hundred examples at most. DQN or other deep RL programs take days with millions of examples. It’s not even close; the difference in learning speed is like another galaxy. So, looking at this paper, I’d have to say I don’t see any way to close such a large gap with any incremental tweaking of deep RL methods, such as those reported annually in ICML or NIPS papers (of which I review a bunch each year, hoping against hope to see a new idea emerge, only to be disappointed!).
So, what’s to be done to “rescue RL”? I’m not sure there’s really a solution out there. I for one have stopped believing that we learn complex skills like driving by something that resembles “pure RL” (that is, from rewards alone). Humans learn to drive because they in fact “know” how to drive before they even try to drive once. They’ve seen their parents, friends, lovers, Uber drivers, etc. drive many, many times, and they’ve seen driving behavior in movies for thousands of hours. So, when they finally get behind the wheel, they instinctively “know” what driving means, but of course, they have never actually controlled a physical car before. So, there is that all-important “last mile” of actual driving that needs to be learned.
But, since the driving program is largely already in place, built in by many thousands of hours of observation, not to mention active instruction by a driving teacher or an anxious parent, what needs to be “learned” are a few control parameters that tell the human brain how much to turn the wheel or press the brake, and more importantly, where to look on the road, etc. This is of course not trivial, which is why humans take a few weeks to get comfortable behind the wheel. But, if you look at real hours of practice, humans learn to drive in a few hundred hours. For those paying for driving instruction, this is expensive, since you are charged by the hour.
Also, it is all-important to remember that when you impose the condition of learning in the real world, there can be “no cheating”! That is, unlike the ridiculous world of Atari video games, like Enduro, where one is given a highly simplified 2D visual world and actions are limited to a few discrete choices, humans must drive in the full 3D real world. They have the huge task of controlling both legs, both hands, neck, body, etc., many hundreds of continuous degrees of freedom, and must also cope with an immense sensory space of stereo vision and binaural hearing.
The only reason humans can learn to drive in a few hundred hours is that we already almost know driving, and we obviously have a fully working vision system, so we can read signs and recognize cars and pedestrians, and our hearing system also recognizes sirens, alerts, horns, etc. So, if you look at the immensity of the whole driving task, I would claim more than 95% of the driving knowledge is already known, and the small remaining part has to be acquired from practice. This is the only explanation for how humans learn such a complex skill as driving in a few hundred hours. There is NO magic here.
So, in that sense, pure (deep) RL seems like a dead end. The pure (deep) RL problem formulation really does not hold much interest for me any more. What is needed in its place is a more complex model of how learning happens: one that combines observation, transfer learning, and various forms of behavior cloning from observed demonstrations, and then takes this knowledge and improves it with some actual trial-and-error RL.
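To make the contrast concrete, here is a minimal sketch, not anything taken from the work discussed above. It assumes a made-up 10-state chain world, and every name in it (`step`, `run_episode`, `clone_demonstrations`, the `bonus` value) is hypothetical. One tabular Q-learner starts knowing nothing; the other gets a crude head start from a single observed demonstration and then refines it by trial and error.

```python
import random

N_STATES = 10          # toy chain 0..9; the goal is state 9 (an assumed environment)
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Move along the chain; reward 1.0 only on reaching the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == RIGHT else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def greedy(q, state, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a in (LEFT, RIGHT) if q[state][a] == best])

def run_episode(q, rng, max_steps=10_000):
    """One episode of greedy Q-learning; returns the number of steps taken."""
    state, steps, done = 0, 0, False
    while not done and steps < max_steps:
        action = greedy(q, state, rng)
        nxt, reward, done = step(state, action)
        target = reward + (0.0 if done else GAMMA * max(q[nxt]))
        q[state][action] += ALPHA * (target - q[state][action])
        state, steps = nxt, steps + 1
    return steps

def clone_demonstrations(demos, bonus=0.01):
    """Crude stand-in for behavior cloning: nudge Q toward demonstrated actions."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for state, action in demos:
        q[state][action] = bonus
    return q

rng = random.Random(0)

# "Pure RL": the learner starts with no knowledge and wanders until it hits the goal.
scratch_q = [[0.0, 0.0] for _ in range(N_STATES)]
scratch_steps = run_episode(scratch_q, rng)

# Observation first: one watched "expert" run, then trial-and-error refinement.
demos = [(s, RIGHT) for s in range(N_STATES - 1)]
warm_q = clone_demonstrations(demos)
warm_steps = run_episode(warm_q, rng)

print("first episode, from scratch:", scratch_steps, "steps")
print("first episode, after watching a demo:", warm_steps, "steps")
```

In this toy setting the learner that has watched a demonstration walks straight to the goal, while the knowledge-free learner performs a random walk first. The gap says nothing quantitative about Atari or driving; it only illustrates why prior knowledge changes the amount of trial and error required.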
One can generalize this to other modes of learning as well. The late Richard Feynman, who was arguably the most influential physicist after the Second World War, taught a classic introductory course at Caltech, which led to probably the best-selling college textbook of all time, the Feynman Lectures on Physics (still being sold almost 60 years later, in the nth edition). When he looked at how students handled his problem sets, Feynman was ultimately disappointed. He realized that even the extremely bright students at Caltech could not “learn” physics simply by sitting in his class and absorbing his lectures. So, he ended his preface to the textbook with a disappointing conclusion, quoting Gibbon (which I had long ago memorized):
“The power of instruction is seldom of much efficacy, except in those happy dispositions where it is almost superfluous”.
I realized the wisdom of this saying after spending two decades or more teaching machine learning to graduate students at several institutions. It seems almost paradoxical, but what Gibbon is saying, and what Feynman and I both discovered, is that learning from teaching only works when the learner “almost already knows” the subject.
But this is precisely what the various theoretical formulations of ML predict must be the case: there is no “free lunch” in terms of being able to learn. DeepMind’s DQN network takes millions and millions of steps to learn an apparently trivial task (to humans) like Frostbite, because initially DQN knows nothing. Humans, in contrast, learn Frostbite in under a minute because they have spent many, many hours building the background needed to learn Frostbite so quickly (e.g., vision, hand-eye coordination, general game-playing strategies).
Unfortunately, the prevailing currents in the field, at venues like “NeurIPS” (NIPS) and ICML and AAAI conferences, tend to “glorify” knowledge-free learning, so you end up with hundreds, if not thousands, of (deep) RL papers, where agents take millions of time steps to learn apparently simple tasks. To me, this approach is ultimately a “dead end”, if your goal is to develop a computational model of how humans learn.