The Data Science Design Manual

Name: The Data Science Design Manual (Texts in Computer Science)
Rating: 4.27 (12 reviews)
ISBN: 9783319554433

Steven S. Skiena

This book serves as an introduction to data science, focusing on the skills and principles needed to build systems for collecting, analyzing, and interpreting data. As a discipline, data science sits at the intersection of statistics, computer science, and machine learning, but it is building a distinct heft and character of its own.

In particular, the book stresses the following basic principles as fundamental to becoming a good data scientist: "Valuing Doing the Simple Things Right," laying the groundwork of what really matters in analyzing data; "Developing Mathematical Intuition," so that readers can understand on an intuitive level why these concepts were developed, how they are useful, and when they work best; and "Thinking Like a Computer Scientist, but Acting Like a Statistician," following approaches which come most naturally to computer scientists while maintaining the core values of statistical reasoning. The book does not emphasize any particular language or suite of data analysis tools, but instead provides a high-level discussion of important design principles.

This book covers enough material for an "Introduction to Data Science" course at the undergraduate or early graduate level. A full set of lecture slides for teaching this course is available at an associated website, along with data resources for projects and assignments, and online video lectures.

Other pedagogical features of this book include: "War Stories" offering perspectives on how data science techniques apply in the real world; "False Starts" revealing the subtle reasons why certain approaches fail; "Take-Home Lessons" emphasizing the big-picture concepts to learn from each chapter; "Homework Problems" providing a wide range of exercises for self-study; "Kaggle Challenges" from the online platform Kaggle; examples taken from the data science television show "The Quant Shop"; and concluding notes in each tutorial chapter pointing readers to primary sources and additional references.

Genres: Computer Science, Technical, Nonfiction, Technology, Mathematics, Programming, Artificial Intelligence

445 pages, Hardcover

Published July 1, 2017

25 people are currently reading

395 people want to read

About the author

Steven S. Skiena

16 books · 114 followers

Community Reviews

5 stars: 34 (50%)
4 stars: 19 (28%)
3 stars: 12 (17%)
2 stars: 2 (2%)
1 star: 0 (0%)

Displaying 1 - 12 of 12 reviews

mkfs

330 reviews · 27 followers

May 20, 2018

A good introductory book to ~~statistical analysis~~ ~~data mining~~ data science. This is clearly aimed at students - the Coda at its conclusion exhorts the reader to now get a data science job (no thanks, got a real job already), and there is an expectation in the word-frequency discussion that the reader has never encountered the word defenestrate (ha! just last week I had to defenestrate an intruder!).

It's always good to get Skiena's take on things -- I've read three or four of his books now -- and this one is no exception. The statistical-learner stuff is linked more closely to standard CS topics (e.g. algorithmic complexity) than in most other texts, and the overview of linear algebra is really quite good.

The only real downside is that it doesn't do what it says on the tin. Unlike The Algorithm Design Manual, this isn't presented as a taxonomy of data science methods with a briefing of when and how each should be applied. More's the pity, as that particular book is sorely needed - even in this one, Skiena points out that most researchers become comfortable with one approach and use it for everything, rather than testing alternate approaches on new problems.

Instead, it's a standard Introduction to Data Science textbook with chapters devoted to topics of increasing complexity/sophistication. Well-written, often entertaining, with an excellent selection of exercises (including many Kaggle challenges and some publicly-available datasets - precisely the sort of project that a beginner needs to get their feet wet).

Dimos Raptis

Author · 2 books · 3 followers

October 7, 2022

This was a nice read. The war stories are very illuminating, it eases from practice into theory quite nicely, and the funny quotes interspersed into the war stories are enjoyable. The examples given were sometimes quite illuminating. Some examples are the intuition of p-values via the concept of permutation tests and the conceptual difference between SVMs and logistic regression (maximizing the margin between the closest points from each side versus maximizing the total confidence of our classification over all points). Other times, things were supposed to be illuminating, but weren't so much (an example is the duality between points and lines in linear regression). This might have to do with the background knowledge of the reader, of course. Theoretical parts were sometimes hard to follow, because they were described very briefly, the book being by nature a summary of techniques rather than a deep dive. An example is the sudden jump into the explanation of how eigenvalues can be used for clustering, even though the explanations for clustering were otherwise insightful and simple. I settled on a 4-star rating because it was a nice book I learned a lot from, but there were bits that felt like they could use some more editing to be more palatable to the reader, and this is what kept me from giving it 5 stars.

Cristián S

16 reviews

February 28, 2024

I started this book motivated because I thought I was going to learn something. What happened was that the more I read, the more I hated this book. I think this book is written as notes for a course, and it is only good for that. Most of the information in it can be found on many sites on the internet and in many books. Also, it does more harm than good: many math concepts are treated in language that is not clear, probably in an attempt to reach a not so math-oriented audience. But, at the same time, the book assumes most things are already known; it is like commenting on things already known. Finally, the language is very US-oriented, with jokes that are not funny at all outside the US. I feel really disappointed, as other books from this author are really good. But not this one.

I don't give it 1 star only because I bought the hardcover version and the printing quality is incredibly good.

Paweł Kacprzak

2 reviews · 2 followers

August 19, 2018

Nice war stories and a great chapter about visualizations - this is what is hard to find in other books, and I guess it might be a new read for many computer scientists/programmers. I also appreciate the many practical examples. A few other chapters are more or less standard, and some of them, for instance the one about distributed computing, feel like they were thrown in just to fill out the content. Nevertheless, I recommend reading it.

Lucille Nguyen

417 reviews · 12 followers

February 20, 2025

Good textbook introducing (or in my case, refreshing) topics in data science. Readable and designed for a non-math-heavy audience, probably for computer scientists who never go beyond a couple of calculus courses and discrete math. Sometimes this works to its detriment, spending multiple pages on something that a reader versed in mathematical formalism would understand in one page or less. Nonetheless, a readable and accessible book on the subject.

Sebastian

12 reviews · 1 follower

January 20, 2019

One of the best resources at the time for getting into data science. Highly recommended, as are any of S. Skiena's books (which he also recommends reading for a better understanding of some topics).

RiskyReads

79 reviews · 3 followers

June 13, 2022

I used this for my master's thesis and it really helped with all the tasks and methods used in data science. I do wish there was a little more about verification and validation, but I found the rest of the book very useful.

Duc Nguyen

15 reviews

September 1, 2025

Not good. I find that many math concepts presented in the book are somewhat discrete and unclear. Also, the jokes in this book are not funny, really :). For example: "... The theory of linear algebra works except when it doesn’t work...".

Ha ha ha.

Not for me.