Practical Machine Learning: A New Look at Anomaly Detection

Name: Practical Machine Learning: A New Look at Anomaly Detection
Rating: 2.95 (13 reviews)
ISBN: 9781491914182

Ted Dunning, Ellen Friedman

Finding Data Anomalies You Didn't Know to Look For

Anomaly detection is the detective work of machine learning: finding the unusual, catching the fraud, discovering strange activity in large and complex datasets. But, unlike Sherlock Holmes, you may not know what the puzzle is, much less what "suspects" you're looking for. This O'Reilly report uses practical examples to explain how the underlying concepts of anomaly detection work.

From banking security to natural sciences, medicine, and marketing, anomaly detection has many useful applications in this age of big data. And the search for anomalies will intensify once the Internet of Things spawns even more new types of data. The concepts described in this report will help you tackle anomaly detection in your own project.

Use probabilistic models to predict what's normal and contrast that to what you observe
Set an adaptive threshold to determine which data falls outside of the normal range, using the t-digest algorithm
Establish normal fluctuations in complex systems and signals (such as an EKG) with a more adaptive probabilistic model
Use historical data to discover anomalies in sporadic event streams, such as web traffic
Learn how to use deviations in expected behavior to trigger fraud alerts
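
The adaptive-threshold idea in the list above can be sketched roughly as follows. This is a minimal illustration, not code from the report: the names `quantile`, `flag_anomalies`, `window`, and `warmup` are hypothetical, and an exact quantile over a sliding window stands in for t-digest, which estimates quantiles in bounded memory instead of sorting stored values.

```python
import math
from collections import deque


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def flag_anomalies(scores, q=0.99, window=500, warmup=50):
    """Yield (score, threshold, is_anomaly) for each score in a stream.

    The threshold is the q-th quantile of up to `window` previous scores,
    so it adapts as the distribution of scores drifts. An exact quantile is
    an assumption made for brevity; a quantile sketch such as t-digest
    would replace the sort on a real stream.
    """
    recent = deque(maxlen=window)
    for score in scores:
        if len(recent) >= warmup:
            threshold = quantile(sorted(recent), q)
            is_anomaly = score > threshold
        else:
            # Too little history to judge what "normal" looks like yet.
            threshold = None
            is_anomaly = False
        yield score, threshold, is_anomaly
        recent.append(score)


if __name__ == "__main__":
    stream = [float(i % 10) for i in range(100)] + [50.0, 3.0]
    for score, threshold, is_anomaly in flag_anomalies(stream):
        if is_anomaly:
            print(f"anomaly: score={score} threshold={threshold}")
```

Each score is compared against a threshold computed only from earlier scores, so a single extreme value cannot raise its own threshold before it is tested.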

Genres: Technology, Nonfiction, Computers, Software, Technical

66 pages, ebook

First published July 21, 2014

4 people are currently reading

79 people want to read

About the author

Ted Dunning

14 books, 3 followers

Community Reviews

5 stars: 2 (4%)
4 stars: 9 (20%)
3 stars: 20 (45%)
2 stars: 11 (25%)
1 star: 2 (4%)

Petr

437 reviews

October 21, 2018

A short introduction to anomaly detection that touches on basically the whole range and uses examples to demonstrate the points. It is not uselessly verbose, but I would still prefer to have a more structured approach and have some bullet-points at the end of each chapter. The author also provides GitHub repositories for those who are interested in the code that was used for the examples, but the text itself is programming language independent.

ebook turingy

Charles

5 reviews

September 4, 2017

Pretty light on technical and mathematical detail, overall. For the first few chapters, the book just comes across like a long advertorial for t-digest (which, I hasten to add, isn't explained in any real detail - it just gets introduced at the beginning of Chapter 3 as a "there's this algorithm, use it" type thing). The first chapters also give an overly lengthy introduction to some basic concepts in time-series analysis.

The fourth chapter is a lot more interesting, as it discusses using a streaming clustering algorithm to decompose a complex signal - but again, it's missing detail, which is a genuine shame. This lack of detail leaves me with more questions than answers - e.g. how is a signal reconstructed using the output of the streaming K-Means? What representation is used for the windowed signal data to input into the K-Means algorithm?

Overall, this book would have been better if it was about half the length and written as a blog post... or if it were two or three times longer and included implementation details. To finish on a positive note, it did give a nice useful introduction to some ideas that I plan to take and apply to other real-world scenarios - but it was just that - an introduction. I must add - the author has included a set of nice examples on his GitHub account which contain example code and data to go along with the explanations. This helps a lot to solidify several of the concepts introduced in the later chapters, and makes the lack of detail in the book a little bit less troublesome.

Vinayak Hegde

707 reviews, 93 followers

January 13, 2018

This book has some good ideas about anomaly detection, such as using t-digest for setting thresholds and using deep learning and autoencoders for detecting anomalies. It also explains the concept of seasonality and the usage of reconstruction and diffing with past patterns quite well.

However, it leaves one unsated on the technical front. There are links to further studies, but many of the concepts are not completely fleshed out. A lot of sections seem quite repetitive as well.

tech-programming

Simón

158 reviews

October 28, 2016

Short but good overview of Anomaly Detection.

For those familiar with statistics, there won't be any surprises here: some concepts and names will be introduced but it will feel like a natural step forward through the same path. For those without statistics, the book can still be useful as everything should be easy to follow, but learning about statistics will likely help moving forward.

professional

Arvid Nybrant

2 reviews

November 20, 2021

The title "A new look at anomaly detection" signals that the book will provide some new and interesting insights in regard to anomaly detection. It did not. There is far better material on the subject out there. If you're a somewhat technical person, this is likely not a good book for you.

Arun

211 reviews, 67 followers

September 29, 2023

Good overview of anomaly detection techniques (could have been even shorter). Gives me a lot to think about using some of these techniques at work. We already use the author’s t-digest algorithm (a custom implementation in C) for a different use case, woot!

Horia Calborean

419 reviews, 1 follower

September 11, 2017

Horrible

Honza Drchal

1 review

September 19, 2017

Only rough explanations were given with important details missing. No references to other literature! The link to the last example source code contains an empty repository.

David Westerveld

282 reviews, 1 follower

November 15, 2017

Excellent high-level summary. Not much math and certainly no in-depth exploration, but a great little introduction to some of the basics of machine learning for someone new to the field.

Andrew-John Hickman

10 reviews

December 5, 2017

A decent overall preview of new techniques for tuning systems to better recognize anomalies among all the data points being generated in the modern era.

Tom Lous

42 reviews, 2 followers

August 24, 2016

The downside of ebooks is that it is not immediately clear that a book has only 66 pages of content, where you would expect a book with a similar title to at least occupy you for more than an afternoon's read.
The book has some nice overviews of anomaly detection, but obviously doesn't really dive into the matter.
The book also seems a bit constructed around this 't-Digest', which was thought up by the author, Ted Dunning. To be fair, in my limited capacity, the t-Digest seems like a very good way to estimate medians on distributed / streaming data.
However, it would have been more 'fair' to name the book 't-Digest: Using machine learning to estimate streaming data statistics' (or something alike).
That wouldn't have gotten my hopes up of actually learning more (than one thing) about anomaly detection.

Brian Powell

195 reviews, 34 followers

August 3, 2020

This book is too short and too terse to be of much value to probably anyone. The chapter on t-digest doesn't actually tell you what t-digest does or how it works. The following chapters cover simple use-cases that don't extrapolate well to actual, real life machine learning problems. None of them, for example, would be able to detect the freaky-ass fish swimming the wrong way on the cover.

machine-learning