Mastering Machine Learning with Scikit-Learn

Name: Mastering Machine Learning with Scikit-Learn
Rating: 4 (8 reviews)
ISBN: 9781788298490

Gavin Hackeling

Use scikit-learn to apply machine learning to real-world problems

About This Book
- Master popular machine learning models including k-nearest neighbors, random forests, logistic regression, k-means, naive Bayes, and artificial neural networks
- Learn how to build and evaluate performance of efficient models using scikit-learn
- Practical guide to master your basics and learn from real life applications of machine learning

Who This Book Is For
This book is intended for software engineers who want to understand how common machine learning algorithms work and develop an intuition for how to use them, and for data scientists who want to learn about the scikit-learn API. Familiarity with machine learning fundamentals and Python are helpful, but not required.

What You Will Learn
- Review fundamental concepts such as bias and variance
- Extract features from categorical variables, text, and images
- Predict the values of continuous variables using linear regression and K Nearest Neighbors
- Classify documents and images using logistic regression and support vector machines
- Create ensembles of estimators using bagging and boosting techniques
- Discover hidden structures in data using K-Means clustering
- Evaluate the performance of machine learning systems in common tasks

In Detail
Machine learning is the buzzword bringing computer science and statistics together to build smart and efficient models. Using powerful algorithms and techniques offered by machine learning you can automate any analytical model. This book examines a variety of machine learning models including popular machine learning algorithms such as k-nearest neighbors, logistic regression, naive Bayes, k-means, decision trees, and artificial neural networks. It discusses data preprocessing, hyperparameter optimization, and ensemble methods. You will build systems that classify documents, recognize images, detect ads, and more. You will learn to use scikit-learn's API to extract features from categorical variables, text and images; evaluate model performance, and develop an intuition for how to improve your model's performance. By the end of this book, you will master all required concepts of scikit-learn to build efficient models at work to carry out advanced tasks with the practical approach.

Style and approach
This book is motivated by the belief that you do not understand something until you can describe it simply. Work through toy problems to develop your understanding of the learning algorithms and models, then apply your learnings to real-life problems.

Genres: Programming, Artificial Intelligence, Technology, Computers, Technical

254 pages, ebook

First published November 10, 2014

22 people are currently reading

56 people want to read

About the author

Gavin Hackeling

2 books

Community Reviews

5 stars: 10 (22%)
4 stars: 25 (55%)
3 stars: 10 (22%)
2 stars: 0 (0%)
1 star: 0 (0%)

Displaying 1 - 8 of 8 reviews

Jeff_abrahamson

1 review

December 14, 2014

The book is a reasonable soft introduction to machine learning concepts for practitioners whose goal is to understand just enough of the theory to use the tools. The explanations are sometimes imprecise analogies, but usually seem to communicate about the right intuition. The book could do with some exercises to guide new learners but is otherwise a quite good introduction.

That said, did anyone proof-read the mathematics? For example, on page 40 of the pdf, a missing absolute value suggests that L1 norms can be negative. At another point he explains that only square matrices are invertible (true) and so we multiply by the transpose in order to get a square matrix that we can invert. Mechanically correct, but otherwise wrong: just because we have a square matrix doesn't make it meaningful to invert it.
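Both points can be checked numerically. The following is a minimal NumPy sketch, not taken from the book; the vector `v` and the matrix `X` are made-up values chosen for illustration. It shows that an L1 norm (a sum of absolute values) stays non-negative where a plain sum does not, and that multiplying by the transpose yields a square matrix that can still be singular.

```python
import numpy as np

# Made-up vector with a negative entry. The L1 norm sums absolute values,
# so it cannot be negative; dropping the absolute value gives a plain sum.
v = np.array([1.0, -4.0, 2.0])
l1_norm = np.sum(np.abs(v))
plain_sum = np.sum(v)

# Made-up design matrix whose third column equals the sum of the first two,
# so its columns are linearly dependent.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [3.0, 1.0, 4.0],
])

# X^T X is always square (3x3 here), but it inherits the rank of X.
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)

print("L1 norm:", l1_norm, "plain sum:", plain_sum)
print("shape of X^T X:", gram.shape, "rank:", rank)
```

Because the rank (2) is smaller than the size of the matrix (3), this particular `X.T @ X` has no inverse even though it is square.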

The text is rife with this sort of almost-correct stuff. "Occam's razor states that a hypothesis with the fewest assumptions is the best". Actually, it states that among competing hypotheses and in the absence of certainty, one should tentatively prefer the hypothesis with the fewest assumptions, until we have more data. "Hyperparameters are parameters of the model that are not learned automatically and must be set manually". Actually, hyperparameters are parameters of the prior distribution rather than parameters of the model.
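Ridge regression is one case where the two definitions of a hyperparameter meet: its penalty strength can be read as the precision of a zero-mean Gaussian prior on the weights, and in scikit-learn it is also a value the user sets by hand. A minimal sketch, assuming scikit-learn's `Ridge` estimator and synthetic data invented for this illustration (none of it comes from the book):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic regression data (made up for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

# alpha is set by hand and is not learned by fit(); coef_ is learned.
weak = Ridge(alpha=0.01).fit(X, y)     # nearly flat prior on the weights
strong = Ridge(alpha=100.0).fit(X, y)  # tight prior pulling weights to zero

weak_norm = np.linalg.norm(weak.coef_)
strong_norm = np.linalg.norm(strong.coef_)

print("weak prior coefficients:  ", weak.coef_)
print("strong prior coefficients:", strong.coef_)
```

The learned coefficients shrink toward zero as `alpha` grows, which is the behavior one would expect from a tighter prior.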

A few points are differently perplexing. Why in 2014 does he use Python 2.7 instead of Python 3.x to illustrate his examples? Why does he not even mention IPython? Why does he use manifestly bad variable names (for example, "xx")?

Some of my biggest niggles concerned the EPUB formatting. Reading on my Nexus 7, the mathematics scaled differently (smaller) than the rest of the text, even rendering unreadably small for inline symbols. Many of the figures were also too small to read without enlargement. The python code itself wrapped awkwardly. Numbers are often left-justified where convention would right-justify or justify on the decimal point. The book was almost unreadable for me in epub. The pdf was fine on a 10" tablet but not on a 7" tablet. Otherwise, stick with paper.

Overall, the book is a decent enough introduction to machine learning concepts for those who just need to use its techniques and who will have plenty of opportunity to test their results. This last is important, and largely absent in the book: machine learning techniques are not use and forget. We try things, we measure, and then we try some more. And we keep measuring, because the underlying data may change in subtle ways over time. Whatever theoretical mistakes result from a high-level-only view become even more important to measure.
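As a rough illustration of the "try, measure, try again" loop described above, here is a minimal sketch using scikit-learn's `cross_val_score`. The synthetic dataset and the choice of logistic regression are assumptions made for this example, not something taken from the book.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (made up for this sketch).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Measure accuracy on 5 held-out folds rather than trusting a single fit.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)

print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```

Re-running the same measurement whenever the model or the data changes is one simple way to notice the subtle drift mentioned above.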

Okeyo Mayaka

1 review

November 28, 2018

Excellent book overall. I read this book as a complete beginner to Machine Learning, and I was more than satisfied with the content therein. I liked the structuring of the chapters where a model[s] would be discussed then after training, the emphasis is shifted to performance metrics to use to evaluate the model[s]. I found this informative since most books I'd read to this point only discussed the model, training the model, and nothing more. I would recommend this for a beginner in Machine Learning.

Frank

36 reviews, 2 followers

September 18, 2018

Rather mediocre book.

Moustafa

14 reviews, 7 followers

July 15, 2015

This book presents a very gentle introduction to machine learning in Python using scikit-learn. It covers a broad range of machine learning methods (linear regression, logistic regression, SVM, clustering, neural networks, etc.) with real examples and complete code to run those tasks yourself. This is certainly an effective way to teach the use of the scikit-learn framework.

On the other hand, I have some comments about how the book could have been better:
1. It assumes a good knowledge of Python and NumPy (which is used extensively by the scikit-learn implementation); adding an introductory chapter (or perhaps an appendix) with quick tutorials on both would have made the meal more complete.

2. This book is NOT a machine learning book! It provides a quick overview of the methods, but it focuses more on the scikit-learn API and the code examples. If you are eager to learn more about machine learning, then you still need to read a real machine learning book to understand the theory and limitations of the various methods.

3. Some minor typos exist :)

My overall recommendation is that the book achieves its goal of providing quick hands-on experiments in machine learning with sklearn for those who are looking for that!

Phil Moyer

24 reviews, 3 followers

January 4, 2016

Another excellent book from this series; it deals exclusively with the scikit-learn library in Python. This is more advanced than Python Machine Learning, but it is a very good book. It does not delve too deeply into the mathematics of machine learning systems, so it is much more "applied science" than, for example, Machine Learning: A Probabilistic Perspective, which is extremely technical and mathematical (or rather, statistical) in nature. You'll learn how to build machine learning systems from this book, while you'll learn (or be seriously challenged to learn) the back theory in Murphy's book. If you want to build ML systems In Real Life, this is one of the books to grab.

Franck Chauvel

119 reviews, 5 followers

December 14, 2016

As the title suggests, this is about machine learning algorithms using Python's scikit-learn library.

As for algorithms, the content is very similar to other machine learning books such as Machine Learning for Hackers, but also includes decision trees and random forests. I find the text really easy to read, and I appreciated the effort made to convey the intuition behind the formulas.

As for Python code examples, although some are out-of-date, I found the documentation of scikit-learn detailed enough and I managed to reproduce the ones I was interested in without any problem.

data-science