30 Free Machine Learning E-Books!
Today, I have gathered 30 machine learning e-books that are freely available online. Enjoy!
Deep Learning
Ian Goodfellow, Yoshua Bengio, and Aaron Courville
The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is complete and will remain available online for free.
Mathematics for Machine Learning
Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong
The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability, and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students or professionals to learn mathematics efficiently. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites.
An Introduction to Statistical Learning
Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor
An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. This book is appropriate for anyone who wishes to use contemporary tools for data analysis.
The Elements of Statistical Learning
Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman
This book describes the important ideas in a variety of fields, such as medicine, biology, finance, and marketing, in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with liberal use of color graphics.
Probabilistic Machine Learning: An Introduction
Kevin Patrick Murphy
This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory.
Probabilistic Machine Learning: Advanced Topics
Kevin Patrick Murphy
An advanced counterpart to Probabilistic Machine Learning: An Introduction, this high-level textbook provides researchers and graduate students with detailed coverage of cutting-edge topics in machine learning, including deep generative modeling, graphical models, Bayesian inference, reinforcement learning, and causality.
Understanding Machine Learning: From Theory to Algorithms
Shai Shalev-Shwartz and Shai Ben-David
The aim of this textbook is to introduce machine learning and the algorithmic paradigms it offers in a principled way. The book provides an extensive theoretical account of the fundamental ideas underlying machine learning and the mathematical derivations that transform these principles into practical algorithms.
Automated Machine Learning: Methods, Systems, Challenges
Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren
This open-access book presents the first comprehensive overview of general methods in Automated Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first series of international challenges of AutoML systems.
Applied Causal Inference
Uday Kamath, Kenneth Graham, Mitchell Naylor
This book is designed to help readers at every level of experience with causal inference: nearly everyone, from absolute beginners to experienced practitioners of causality, will be able to learn something about the world of causal inference and how it can be applied.
Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto
In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the field's key ideas and algorithms. This second edition has been significantly expanded and updated, presenting new topics and updating coverage of other topics.
The Hundred-Page Machine Learning Book
Andriy Burkov
Burkov has undertaken a very useful but impossibly hard task in reducing all of machine learning to 100 pages. He succeeds well in choosing the topics, both theory and practice, that will be useful to practitioners, and for the reader who understands that this is the first 100 (or actually 150) pages you will read, not the last, it provides a solid introduction to the field.
Machine Learning Engineering
Andriy Burkov
This new book by Andriy Burkov is the most complete applied AI book out there. It is filled with best practices and design patterns for building reliable machine learning solutions that scale.
Natural Language Processing with Python
Steven Bird, Ewan Klein, and Edward Loper
This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation.
Dive into Deep Learning
Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola
This book is a comprehensive resource that makes deep learning approachable while still providing sufficient technical depth to enable engineers, scientists, and students to use deep learning in their own work.
Machine Learning Yearning
Andrew Ng
Prof. Andrew Ng originally released the book in 13 parts, which are consolidated here into one complete volume. In this book, you will learn how to align on ML strategy in a team setting, as well as how to set up development (dev) sets and test sets.
Machine Learning for Humans
Vishal Maini, Samer Sabri
This guide is intended to be accessible to anyone. Basic concepts in probability, statistics, programming, linear algebra, and calculus will be discussed, but it isn’t necessary to have prior knowledge of them to gain value from this series.
Pattern Recognition and Machine Learning
Christopher M. Bishop
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible, and it makes systematic use of graphical models to describe probability distributions, an approach few other machine learning books take.
Deep Learning on Graphs
Yao Ma and Jiliang Tang
This book offers comprehensive coverage of techniques for applying deep learning to graph-structured data, with a specific focus on Graph Neural Networks (GNNs). The foundations of GNN models are introduced in detail, including the two main building operations: graph filtering and graph pooling.
Approaching (Almost) Any Machine Learning Problem
Abhishek Thakur
This book is for people who have some theoretical knowledge of machine learning and deep learning and want to dive into applied machine learning. The book doesn't explain the algorithms themselves but is oriented towards how and what you should use to solve machine learning and deep learning problems.
Feature Engineering and Selection: A Practical Approach for Predictive Models
Max Kuhn and Kjell Johnson
This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance.
Hands-On Machine Learning with R
Bradley Boehmke & Brandon Greenwell
Hands-On Machine Learning with R provides a practical and applied approach to learning and developing intuition for today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R.
Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI
Shlomo Kashani and Amir Ivry
This book is designed both to rehearse interview- or exam-specific topics and to provide machine learning M.Sc./Ph.D. students, and anyone awaiting an interview, with a well-organized overview of the field.
An Introduction to Machine Learning Interpretability
Patrick Hall and Navdeep Gill
To help practitioners make the most of recent and disruptive breakthroughs in debugging, explainability, fairness, and interpretability techniques for machine learning, this report defines key terms and introduces the human and commercial motivations for the techniques. It then discusses predictive modeling and machine learning from an applied perspective, focusing on the common challenges of business adoption, internal model documentation, governance, validation requirements, and external regulatory mandates.
Interpretable Machine Learning: A Guide For Making Black Box Models Explainable
Christoph Molnar
This book covers a range of interpretability methods, from inherently interpretable models to methods that can make any model interpretable, such as SHAP, LIME and permutation feature importance. It also includes interpretation methods specific to deep neural networks and discusses why interpretability is important in machine learning.
Boosting: Foundations and Algorithms
Robert E. Schapire, Yoav Freund
An accessible introduction and essential reference for an approach to machine learning that creates highly accurate prediction rules by combining many weak and inaccurate ones.
A Brief Introduction to Machine Learning for Engineers
Osvaldo Simeone
For an engineer approaching machine learning, the first problem is where to start. The answer is often "for a general, but slightly outdated, introduction, read this book; for a detailed survey of methods based on probabilistic models, check this reference; to learn about statistical learning, this text is useful", and so on. This monograph instead aims to provide a single, self-contained starting point.
Speech and Language Processing
Daniel Jurafsky & James Martin
Intended for undergraduate or graduate courses in classical natural language processing, statistical natural language processing, speech recognition, computational linguistics, and human language processing.
Computer Vision: Models, Learning, and Inference
Simon J.D. Prince
This modern treatment of computer vision focuses on learning and inference in probabilistic models as a unifying theme. It shows how to use training data to learn the relationships between the observed image data and the aspects of the world that we wish to estimate, such as the 3D structure or the object class, and how to exploit these relationships to make new inferences about the world from new image data.
Information Theory, Inference and Learning Algorithms
David J. C. MacKay
This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction.
Machine Learning For Dummies
Judith Hurwitz and Daniel Kirsch
Machine Learning For Dummies, IBM Limited Edition, gives you insights into what machine learning is all about and how it can change the way you use data to gain new insights. Your data is only as good as what you do with it and how you manage it.
While some might argue (myself included) that a guide to machine learning containing 30 e-books sounds a wee bit excessive, it's still an intriguing resource for anyone looking to delve into the depths of this ever-expanding domain. After all, who can resist the temptation of training neural networks and unraveling the mysteries of data prediction? It's impossible to ignore the intriguing possibilities brought about by the power of machine learning. Cheers to knowledge enhancement.
Hidden Markov Models can be used to generate a language, that is, to list elements from a family of strings. For example, if you have an HMM that models a set of sequences, you can generate members of this family by sampling sequences that fall into the group of sequences being modelled.
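As a minimal sketch of this generative use, here is a toy two-state HMM over the alphabet {a, b, c} that samples member strings of the family it models (the states, symbols, and probabilities are all invented for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy HMM: 2 hidden states, 3 observable symbols (all values invented).
    start = np.array([0.6, 0.4])        # P(initial state)
    trans = np.array([[0.7, 0.3],       # P(next state | current state)
                      [0.2, 0.8]])
    emit = np.array([[0.5, 0.4, 0.1],   # P(symbol | state)
                     [0.1, 0.3, 0.6]])
    symbols = "abc"

    def sample_sequence(length):
        """Generate one member of the family of strings the HMM models."""
        state = rng.choice(2, p=start)
        out = []
        for _ in range(length):
            out.append(symbols[rng.choice(3, p=emit[state])])
            state = rng.choice(2, p=trans[state])
        return "".join(out)

    print([sample_sequence(5) for _ in range(3)])  # three sampled strings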
Neural Networks take an input from a high-dimensional space and simply map it to a lower-dimensional space (exactly how a network performs this mapping depends on its training, its topology, and other factors). For example, you might take a 64-pixel image of a digit and map it to a true/false value that describes whether the digit is a 1 or a 0.
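A minimal sketch of that mapping, with random untrained weights standing in for learned ones (the layer sizes are illustrative), might look like this:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Untrained, random weights: in practice these would come from training.
    W1 = rng.normal(scale=0.1, size=(64, 16))  # 64 inputs -> 16 hidden units
    W2 = rng.normal(scale=0.1, size=(16, 1))   # 16 hidden units -> 1 output

    def predict(image_64px):
        """Map a flattened 8x8 image (64 values) to a single probability."""
        hidden = sigmoid(image_64px @ W1)      # the hidden representation
        return sigmoid(hidden @ W2).item()     # e.g. P(the digit is a 1)

    x = rng.random(64)                         # stand-in for a real image
    print(predict(x))                          # a value in (0, 1)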
Whilst both methods are able to (or can at least try to) discriminate whether an item is a member of a class or not, Neural Networks cannot generate a language as described above.
There are alternatives to Hidden Markov Models: for example, you might use a more general Bayesian network, a different topology, or a Stochastic Context-Free Grammar (SCFG) if you believe the problem lies in the HMM's lack of power to model your data, that is, if you need an algorithm that can discriminate between more complex hypotheses and/or describe much more complex behaviour in the data.
What is hidden and what is observed: The thing that is hidden in a hidden Markov model is the same as the thing that is hidden in a discrete mixture model, so for clarity, forget about the hidden state's dynamics and stick with a finite mixture model as an example. The 'state' in this model is the identity of the component that caused each observation. In this class of model such causes are never observed, so 'hidden cause' is translated statistically into the claim that the observed data have marginal dependencies which are removed when the source component is known. The source components are estimated to be whatever makes this statistical relationship true.

The thing that is hidden in a feedforward multilayer neural network with sigmoid middle units is the states of those units, not the outputs, which are the target of inference. When the output of the network is a classification, i.e., a probability distribution over possible output categories, the values of these hidden units define a space within which the categories are separable. The trick in learning such a model is to make a hidden space (by adjusting the mapping out of the input units) within which the problem is linear; consequently, non-linear decision boundaries are possible from the system as a whole.
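To make the 'hidden cause' idea concrete, here is a small sketch that infers the posterior responsibility P(component | x) in a two-component one-dimensional Gaussian mixture; all parameters are invented for illustration:

    import numpy as np

    # Two-component 1-D Gaussian mixture (parameters invented).
    weights = np.array([0.3, 0.7])   # prior P(component)
    means = np.array([-2.0, 1.5])
    stds = np.array([1.0, 0.5])

    def gauss_pdf(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    def responsibilities(x):
        """Posterior P(component | x): the estimated hidden cause of x."""
        likelihood = gauss_pdf(x, means, stds)  # P(x | component)
        joint = weights * likelihood            # prior * likelihood
        return joint / joint.sum()              # Bayes' theorem

    print(responsibilities(0.0))  # a distribution over the two components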
Generative versus discriminative: The mixture model (and the HMM) is a model of the data-generating process, sometimes called a likelihood or 'forward model'. When coupled with some assumptions about the prior probabilities of each state, you can infer a distribution over possible values of the hidden state using Bayes' theorem (a generative approach). Note that, while called a 'prior', both the prior and the parameters in the likelihood are usually learned from data. In contrast to the mixture model (and HMM), the neural network learns a posterior distribution over the output categories directly (a discriminative approach). This is possible because the output values were observed during estimation, so it is not necessary to construct a posterior from a prior and a specific model of the likelihood such as a mixture. The posterior is learnt directly from the data, which is more efficient and less model-dependent.
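The contrast can be sketched directly: the generative route models P(x | class) and P(class) and applies Bayes' theorem, while the discriminative route fits P(class | x) itself, here with a hand-rolled logistic regression. All numbers are illustrative:

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy 1-D, two-class data (all numbers invented).
    x0 = rng.normal(-1.0, 1.0, 100)                 # class 0 samples
    x1 = rng.normal(1.5, 1.0, 100)                  # class 1 samples
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(100), np.ones(100)])

    # Generative: model P(x | class) and P(class), then apply Bayes' theorem.
    mu, sd, prior = [x0.mean(), x1.mean()], [x0.std(), x1.std()], [0.5, 0.5]

    def gauss(v, m, s):
        return np.exp(-0.5 * ((v - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    def posterior_generative(v):
        joint = np.array([prior[k] * gauss(v, mu[k], sd[k]) for k in (0, 1)])
        return joint[1] / joint.sum()               # P(class 1 | v)

    # Discriminative: learn P(class | x) directly, with no model of how
    # x itself was generated (logistic regression by gradient descent).
    w, b = 0.0, 0.0
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))      # current posterior estimate
        w -= 0.1 * np.mean((p - y) * x)             # gradient of the log loss
        b -= 0.1 * np.mean(p - y)

    v = 0.5
    print(posterior_generative(v), 1.0 / (1.0 + np.exp(-(w * v + b))))

On a simple, well-separated problem like this one, the two estimates of P(class 1 | x) should roughly agree; they differ in what was modelled to obtain them.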
Mix and match: To make things more confusing, these approaches can be combined, e.g. when the mixture model (or HMM) state is sometimes actually observed. When that is true, and in some other circumstances not relevant here, it is possible to train discriminatively in an otherwise generative model. Similarly, it is possible to replace the mixture-model mapping of an HMM with a more flexible forward model, e.g., a neural network.