Practical Machine Learning: A New Look at Anomaly Detection

Finding Data Anomalies You Didn't Know to Look For

Anomaly detection is the detective work of machine learning: finding the unusual, catching the fraud, discovering strange activity in large and complex datasets. But, unlike Sherlock Holmes, you may not know what the puzzle is, much less what "suspects" you're looking for. This O'Reilly report uses practical examples to explain how the underlying concepts of anomaly detection work.

From banking security to natural sciences, medicine, and marketing, anomaly detection has many useful applications in this age of big data. And the search for anomalies will intensify once the Internet of Things spawns even more new types of data. The concepts described in this report will help you tackle anomaly detection in your own project.


Use probabilistic models to predict what's normal and contrast that to what you observe
Set an adaptive threshold to determine which data falls outside of the normal range, using the t-digest algorithm
Establish normal fluctuations in complex systems and signals (such as an EKG) with a more adaptive probabilistic model
Use historical data to discover anomalies in sporadic event streams, such as web traffic
Learn how to use deviations in expected behavior to trigger fraud alerts
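
The adaptive-threshold idea in the list above can be illustrated with a minimal sketch. It is not the report's code: it assumes an exact quantile over a sorted copy of the data as a stand-in for the t-digest, which approximates the same quantity on streaming data. The function names and the synthetic data are hypothetical.

```python
import random


def adaptive_threshold(scores, quantile=0.99):
    """Return the value below which roughly `quantile` of the scores fall.

    Assumption: an exact quantile on sorted data stands in for the
    t-digest sketch, which estimates quantiles without storing all points.
    """
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)
    # Nearest-rank style index, clamped to the last element.
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]


def flag_anomalies(history, new_values, quantile=0.99):
    """Flag new values that exceed the threshold learned from history."""
    threshold = adaptive_threshold(history, quantile)
    return [value for value in new_values if value > threshold]


# Hypothetical data: absolute deviations between a model's prediction
# and the observed value, mostly small.
rng = random.Random(0)
history = [abs(rng.gauss(0.0, 1.0)) for _ in range(10_000)]
flagged = flag_anomalies(history, [0.3, 1.2, 6.0])
print(flagged)
```

Because the threshold is recomputed from recent history rather than fixed by hand, it moves with the data, which is the sense in which it is adaptive.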

66 pages, ebook

First published July 21, 2014

4 people are currently reading
79 people want to read

About the author

Ted Dunning

14 books, 3 followers

Ratings & Reviews

Community Reviews

5 stars: 2 (4%)
4 stars: 9 (20%)
3 stars: 20 (45%)
2 stars: 11 (25%)
1 star: 2 (4%)
Displaying 1 - 13 of 13 reviews
Petr
437 reviews
October 21, 2018
A short introduction to anomaly detection that touches on basically the whole range and uses examples to demonstrate the points. It is not uselessly verbose, but I would still prefer to have a more structured approach and have some bullet-points at the end of each chapter. The author also provides GitHub repositories for those who are interested in the code that was used for the examples, but the text itself is programming language independent.
Charles
5 reviews
September 4, 2017
Pretty light on technical and mathematical detail, overall. For the first few chapters, the book just comes across like a long advertorial for t-digest (which, I hasten to add, isn't explained in any real detail - it just gets introduced at the beginning of Chapter 3 as a "there's this algorithm, use it" type thing). The first chapters also give quite an overly-lengthy introduction to some basic concepts in time-series analysis.

The fourth chapter is a lot more interesting, as it discusses using a streaming clustering algorithm to decompose a complex signal - but again, it's missing detail, which is a genuine shame. This lack of detail leaves me with more questions than answers - e.g. how is a signal reconstructed using the output of the streaming K-Means? What representation is used for the windowed signal data to input into the K-Means algorithm?

Overall, this book would have been better if it was about half the length and written as a blog post... or if it were two or three times longer and included implementation details. To finish on a positive note, it did give a nice useful introduction to some ideas that I plan to take and apply to other real-world scenarios - but it was just that - an introduction. I must add - the author has included a set of nice examples on his GitHub account which contain example code and data to go along with the explanations. This helps a lot to solidify several of the concepts introduced in the later chapters, and makes the lack of detail in the book a little bit less troublesome.
Vinayak Hegde
707 reviews, 93 followers
January 13, 2018
This book has some good ideas about anomaly detection, such as using t-digest for setting thresholds and using deep learning and autoencoders for detecting anomalies. It also explains the concept of seasonality and the usage of reconstruction and diffing with past patterns quite well.

However, it leaves one unsated on the technical front. There are links to further studies but many of the concepts are not completely fleshed out. A lot of sections seem quite repetitive as well.
Simón
158 reviews
October 28, 2016
Short but good overview of Anomaly Detection.

For those familiar with statistics, there won't be any surprises here: some concepts and names will be introduced but it will feel like a natural step forward through the same path. For those without statistics, the book can still be useful as everything should be easy to follow, but learning about statistics will likely help moving forward.
2 reviews
November 20, 2021
The title "A new look at anomaly detection" signals that the book will provide some new and interesting insights in regard to anomaly detection. It did not. There is far better material on the subject out there. If you're a somewhat technical person, this is likely not a good book for you.
Arun
211 reviews, 67 followers
September 29, 2023
Good overview of anomaly detection techniques (could have been even shorter). Gives me a lot to think about using some of these techniques at work. We already use the author's t-digest algorithm (a custom implementation in C) for a different use case, woot!
Honza Drchal
1 review
September 19, 2017
Only rough explanations were given with important details missing. No references to other literature! The link to the last example source code contains an empty repository.
David Westerveld
282 reviews, 1 follower
November 15, 2017
Excellent high-level summary. Not much math and certainly no in-depth exploration, but a great little introduction to some of the basics of machine learning for someone new to the field.
Andrew-John Hickman
10 reviews
December 5, 2017
A decent overall preview of new techniques for tuning systems to better recognize anomalies among all the data points being generated in the modern era.
Tom Lous
42 reviews, 2 followers
August 24, 2016
The downside of ebooks is that it is not immediately clear that a book has only 66 pages of content, where you would expect a book with a similar title to at least occupy you for more than an afternoon's read.
The book has some nice overviews of anomaly detection, but obviously doesn't really dive into the matter.
The book also seems a bit constructed around this 't-Digest', which was thought up by the author, Ted Dunning. To be fair, in my limited capacity the t-Digest seems like a very good way to estimate medians on distributed / streaming data.
However, it would have been more 'fair' to name the book 't-Digest: Using machine learning to estimate streaming data statistics' (or something alike),
which wouldn't have gotten my hopes up of actually learning more (than 1 thing) about anomaly detection.
Brian Powell
195 reviews, 34 followers
August 3, 2020
This book is too short and too terse to be of much value to probably anyone. The chapter on t-digest doesn't actually tell you what t-digest does or how it works. The following chapters cover simple use-cases that don't extrapolate well to actual, real life machine learning problems. None of them, for example, would be able to detect the freaky-ass fish swimming the wrong way on the cover.