
Machine Learning With Random Forests And Decision Trees: A Visual Guide For Beginners

Machine Learning - Made Easy To Understand

If you are looking for a book to help you understand how the machine learning algorithms "Random Forest" and "Decision Trees" work behind the scenes, then this is a good book for you. Those two algorithms are commonly used in a variety of applications, including big data analysis for industry and data analysis competitions like those you would find on Kaggle.

This book explains how Decision Trees work and how they can be combined into a Random Forest to reduce many of the common problems with decision trees, such as overfitting the training data.



Several Dozen Visual Examples

Equations are great for really understanding every last detail of an algorithm. But to get a basic idea of how something works, in a way that will stick with you 6 months later, nothing beats pictures. This book contains several dozen images which detail things such as how a decision tree picks what splits it will make, how a decision tree can overfit its data, and how multiple decision trees can be combined to form a random forest.

This Is Not A Textbook

Most books and other information on machine learning that I have seen fall into one of two categories: they are either textbooks that explain an algorithm in a way similar to "And then the algorithm optimizes this loss function", or they focus entirely on how to set up code to use the algorithm and how to tune the parameters.

This book takes a different approach that is based on providing simple examples of how Decision Trees and Random Forests work, and building on those examples step by step to encompass the more complicated parts of the algorithms.  The actual equations behind decision trees and random forests get explained by breaking them down and showing what each part of the equation does, and how it affects the examples in question.



Python Files & Excel File For Many Of The Examples Shown In The Book

Some topics in machine learning don't lend themselves to equations in an Excel table. Things like error checking or complicated conditionals are hard to replicate outside of code. However, some topics work quite well in a spreadsheet. Topics such as entropy and information gain, which is how a decision tree picks its splits, can be easily calculated in a spreadsheet. The spreadsheet used to generate many of the examples in this book is available for free download, as are all of the Python scripts that ran the Random Forests & Decision Trees in this book and generated many of the plots and images.

If you are someone who learns by playing with the code, and editing the data or equations to see what changes, then use those resources along with the book for a deeper understanding.
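
To give a feel for the kind of calculation mentioned above, here is a minimal sketch of entropy and information gain in plain Python. It is not taken from the book's downloadable scripts; the function names (`entropy`, `information_gain`) and the toy "yes"/"no" labels are assumptions made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy data: 5 "yes" and 5 "no" labels before any split.
parent = ["yes"] * 5 + ["no"] * 5

# Two candidate splits of the same 10 rows (invented for illustration).
split_a = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]          # fairly clean groups
split_b = [["yes"] * 3 + ["no"] * 2, ["yes"] * 2 + ["no"] * 3]  # still mixed groups

gain_a = information_gain(parent, split_a)
gain_b = information_gain(parent, split_b)

print(f"parent entropy: {entropy(parent):.3f}")
print(f"gain for split A: {gain_a:.3f}")
print(f"gain for split B: {gain_b:.3f}")
```

A tree that picks splits by information gain would prefer split A here, because its child groups are closer to a single class than those of split B.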



Topics Covered

The topics covered in this book are

- An overview of decision trees and random forests
- A manual example of how a human would classify a dataset, compared to how a decision tree would work
- How a decision tree works, and why it is prone to overfitting
- How decision trees get combined to form a random forest
- How to use that random forest to classify data and make predictions
- How to determine how many trees to use in a random forest
- Ju

82 pages, Kindle Edition

Published August 12, 2016

1064 people are currently reading
486 people want to read

About the author

Scott Hartshorn

18 books, 13 followers

Ratings & Reviews

Community Reviews

5 stars: 191 (34%)
4 stars: 208 (38%)
3 stars: 120 (21%)
2 stars: 20 (3%)
1 star: 7 (1%)
Displaying 1 - 30 of 47 reviews
William Anderson
134 reviews, 25 followers
February 13, 2017
Clear explanations on a Small set of knowledge

This book feels more like it belongs as a chapter of another work or as a highly shared blogpost as opposed to a printed text.

That being said it was super clear and easy to read and the author understood when to explain in theory vs code.

If you want to get your feet wet with decision trees this is not a bad place to start but further reading will be required.
Pawin
55 reviews, 2 followers
October 31, 2020
One of the greatest books I have read about tree-based models. The book includes all the important info, such as information theory, Gini & entropy concepts, out of bag, an in-depth look at how the model works, and much more.

Also, this book provides examples using a small dataset, which helps readers understand the underlying concepts quite well.

I would recommend this book to those who are working with tree-based models.
Shelly (YI-Hsuan) LIN
26 reviews, 2 followers
November 3, 2020
This book is perfect for people who want to know how tree-based models work. It explains the algorithms and how they are adopted in an easy, understandable way. Definitely recommended for anyone who wants to be able to say, "Now I know how tree-based models work!"
Sundar
43 reviews, 29 followers
August 3, 2017
A lucid read to learn the intuition and concepts behind Decision Trees and Random Forests. Machine learning concepts have been familiar to me for some time now, yet I never felt this confident about fiddling with the approaches, having spent most of my time in feature engineering. This was an extremely quick refresher which I could use immediately.

After reading Probability - A Beginner's Guide To Permutations And Combinations: The Classic Equations, Better Explained (my review), Bayes Theorem Examples: An Intuitive Guide (my review) and now this book, I've become a fan of Scott Hartshorn.
7 reviews
April 5, 2020
Very good conceptual book for beginners

Easy to read, and focuses on the core concepts. I’d say that if you have no background in statistics, this is approachable though you may have questions as to the mathematical principles (e.g., logarithms in Entropy, and why one uses that to calculate information gain as opposed to Gini).

One major improvement would be to add clearer definitions earlier on in the book for certain terms used repeatedly (node, leaf, branch, tree, feature, category). Some of these are supposed to be intuitive, though the author could be more disciplined in how he applies each of these terms.
Alex Ho
1 review
January 3, 2019
I like how it was written - clear and concise with examples

I was reading content related to Random Forest that I found on the web, but I couldn't really grasp what was happening underneath until I found this book. It shows you the concept and mechanism of how to use RF, why it works, and when not to use it. After reading it, I got the feeling that I could write the algorithm myself too! Good book to read!
Doug Crawford
Author 5 books
May 23, 2017
Great intro

This is the first book I have read on this topic, and I found it very easy to understand. I have a background in more traditional data analysis such as multiple linear regression, control charts and cluster analysis. This book has given me a good start in extending my options for analyzing data.
Hani Rosidaini
38 reviews, 5 followers
August 2, 2017
Very basic concepts for those who want to learn about Machine Learning. Fortunately, the explanations are easy to understand, but if you already know some of them, you may think that the author could have explored a bit more to add to the joy. I gave 3 stars because it made me want to read his other, more advanced books.
1 review
November 5, 2017
This is an outstanding primer for the layperson or beginning student. In just under an hour, this book will provide the reader a solid foundation for understanding how Random Forests and Decision Trees work. Scott provides you clear examples and access to resources (data and code) to practice the examples on your own.
Kanishk
6 reviews
December 23, 2017
Concise and to-the-point

This being my first read on machine learning, I absolutely liked the concise and very hands-on approach used by the author to keep the text engaging. Highly recommended if you are looking to learn about some machine learning algorithms and resources to get started.
60 reviews
December 26, 2019
Detailed book on random forest

It's a great book and provides a lot of detail on the random forest algorithm: how it works, how Gini works, how feature importance works, and more. I recommend this book for anyone starting to use ML or anyone who is already using Random Forest without knowing how it works.
Thanks
Dr Andy Robertshaw
1 review
January 3, 2020
Very clear intro to a complicated concept

Would recommend this book to anyone interested in learning about Random Forests.
It is a clear intro, relating to human thought processes, and giving simple mathematical examples each time.
I feel it has increased my knowledge in the field greatly, and my next mission is to read the Gradient Boosting book by the same author!
1 review, 1 follower
June 18, 2017
Clear and concise

I read this book to quickly refresh on random forests for a data science course and found the examples to be simple and intuitive. Links and references are included throughout if you need more details.
Rhonda Hypolite
16 reviews
September 23, 2018
I really enjoyed this book.

I was curious about machine learning, and decided to purchase. I found that the book was pitched at the right level. Some links to supporting material were also provided (always useful). Superb as a beginner's/introductory book.
Sandipan Chatterjee
19 reviews, 2 followers
February 23, 2019
Good book...job well done!

The book has a nice flow. All the concepts have been properly explained. I feel it's a very level-zero kind of book. But I appreciate Scott for writing this book, because the fundamentals are way more important than the implementation.
Robert
11 reviews
March 5, 2019
Great Intro

This came recommended by another author, and I'm glad I picked it up, as it gives clear explanations of the generation of decision trees and random forests. Will look into the author's other offerings.
John
12 reviews
July 1, 2019
Simple but clear understanding of decision trees.

Basic but energetic review of Implementing decision trees in a machine learning environment. Not too much statistics, not too deep, but relatively thorough.
Tim
25 reviews
August 30, 2019
A very brief, but good introduction

It's no easy task to explain random forests with any detail in a few pages, but this was a solid attempt: a little hard to absorb in some places, but very helpful.
Felipe Augusto
8 reviews
January 2, 2020
Very intuitive guide, the author builds the knowledge about random forests without difficult words. For me, the main topics he covers are:
- How it works;
- How to measure the different built forests;
- How to understand better the generated models and data.
Nikhil Arora
2 reviews
March 18, 2020
Good book for casual read

If you are totally new to machine learning, this is a good text, and text only, without mathematics.

A few more Python examples could have been added to make it more interesting for more readers.
4 reviews
February 20, 2021
Simple language and to the point

Written in a way that makes it easy to understand. Examples are also rather straightforward. Book has a pace without unnecessary content that would make it too long. Just contains what is necessary to grasp the topic and nothing more.
9 reviews, 2 followers
June 3, 2022
Good Intro to Random Forest

This quick intro gives you all the elements to know how to apply Random Forest to your dataset and understand the results and use them in your decision making processes
Jim Lyons
194 reviews, 23 followers
December 6, 2016
Enjoyed this little book! I will also enjoy having it in my library for some quick reference on random forests, or for some Python examples.
Jean-Paul
11 reviews, 17 followers
September 26, 2017
I finished this book in one gym biking session. It was a great introduction to Random Forests/Decision Trees. I'll be picking up more of Scott's guides.
56 reviews, 1 follower
September 25, 2017
Led me to a better understanding of the subject and has likely led to another avenue of research for a work project. Definitely appreciated.
145 reviews, 1 follower
December 15, 2017
Pretty good book for beginners to get an easy understanding of Random Forests and Decision Trees. Well illustrated examples in semi-tech and semi-layman terms.
Weizhao Wang
1 review, 3 followers
December 27, 2017
Not a good book, frankly

This book is so short and did not cover much.
I would not recommend this book; instead, you should get a more complete book.
Steven Sanderson
7 reviews
April 1, 2018
Random Forest quickie

This was a great book on Random Forests and would probably make a great first lecture for a class. It is simple to understand, so big kudos to Scott on this.