Artificial Intelligence is one of the most exciting technologies of the century, and Deep Learning is in many ways the “brain” behind some of the world’s smartest Artificial Intelligence systems. Loosely based on neuron behavior inside human brains, these systems are rapidly catching up with the intelligence of their human creators: defeating the world champion Go player, achieving superhuman performance on video games, driving cars, translating languages, and sometimes even helping law enforcement fight crime. Deep Learning is a revolution that is changing every industry across the globe.
Grokking Deep Learning is the perfect place to begin your deep learning journey. Rather than just learn the “black box” API of some library or framework, you will actually understand how to build these algorithms completely from scratch. You will understand how Deep Learning is able to learn at levels greater than humans. You will be able to understand the “brain” behind state-of-the-art Artificial Intelligence. Furthermore, unlike other courses that assume advanced knowledge of Calculus and leverage complex mathematical notation, if you’re a Python hacker who passed high-school algebra, you’re ready to go. And at the end, you’ll even build an A.I. that will learn to defeat you in a classic Atari game.
A highly interesting and unique book on the subject, which teaches you how to create [deep] neural networks from scratch. In my opinion it could have been better if it included a little math on the side. Also, while the first half of the book holds your hand a lot, the second half picks up the pace way too much.
I have yet to find another resource that is able to effectively capture deep learning—without the overuse of frameworks—in a fundamental way. That being said, I did have some experience with DL paradigms before reading this work, so I’m not sure whether it is everything it’s meant to be for a complete beginner.
1st Chapter - Easy to skim through. I’m really not confident that it is necessary. Could’ve just given a bulleted list of necessary tools in the preface.
2nd Chapter - He does a decent job introducing the concepts, but I’m not sure that he captures nonparametric models quite right for the person without any experience. He at least acknowledges this by informing the reader that “there is a gray area between parametric and nonparametric algorithms.” My bigger problem is that he calls these “counting-based” methods. While technically correct, I do wish he would’ve used the clearer “grouping-based.” Counting leaves it a bit unclear as to why/how nonparametric models find what to count, whereas grouping is a bit more intuitive. This is a minor quibble.
3rd Chapter - This is one of the best abstract-to-specific introductions I’ve seen to a feedforward neural network, but he never explains weight initialization to the reader. I’ll note that the lack of discussion of weight initialization and its importance was a huge knock on this book for me. Especially in an intro book, the reader isn’t going to know how/why weights converge, and thus throwing in manually assigned weights (rather than calling numpy’s random functions to generate them) leaves the reasoning a smidge unclear. I’d be surprised if a few careful readers weren’t confused. It isn’t super complicated to explain, but it is pretty vital.
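To illustrate what I wish he’d covered, here is a minimal sketch of my own (not the book’s code) showing why random initialization matters:

```python
import numpy as np

# If every weight starts identical, every hidden neuron computes the
# same output, receives the same gradient, and stays a clone forever.
same = np.full((3, 4), 0.5)                 # constant weights: no symmetry breaking
mixed = 2 * np.random.random((3, 4)) - 1    # small random weights in [-1, 1)

x = np.array([1.0, 0.0, 1.0])
print(x.dot(same))    # four identical hidden activations
print(x.dot(mixed))   # four different ones, so learning can tell them apart
```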
4th Chapter - I was very pleased to see a simple discussion of why we square the error. A lot of texts assume that the people learning are familiar with the common MSE practice in statistics, when many people are not. I’ve seen this paragraph-length discussion in a few places now, but I was happy to see it here, as otherwise people get a little too hung up on the whole squaring business. On the negative side: “hot and cold” learning is ridiculous. It only serves to confuse what is otherwise a straightforward process. I would do away with this entirely and jump straight into gradient descent. However, I admit that might just be because I am used to calculus; perhaps this makes more sense to someone who isn’t familiar with anything past college algebra. There are also some minor typos in the er_dn, er_up comparisons (depending on the version of the book), so be wary of that. As for his discussion of derivatives: he used a lot of words to describe slope. The definition is fine, but could be cleaned up.
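For the record, the whole chapter reduces to a few lines; here is my own paraphrase of the idea (made-up numbers, not the book’s exact code):

```python
# Gradient descent on a single weight, the process "hot and cold"
# learning is meant to motivate. Squaring the error keeps it positive
# and punishes big misses far more than small ones.
weight, inp, goal = 0.5, 2.0, 0.8
alpha = 0.1                              # learning rate: keeps steps small

for _ in range(10):
    pred = inp * weight
    error = (pred - goal) ** 2           # squared error is always >= 0
    weight_delta = (pred - goal) * inp   # slope of error w.r.t. weight (up to a factor of 2)
    weight -= alpha * weight_delta
    print(f"error: {error:.6f}  prediction: {pred:.4f}")
```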
5th Chapter - There isn’t really much to say about this one. Just goes into weighted averages and matrix operations. Important, but one of the easier parts of NNs to understand if you fully understood the priors.
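If you want the one-line version: a neuron is a weighted average, a weighted average is a dot product, and a layer is the same thing done with a matrix. A quick sketch with made-up numbers:

```python
import numpy as np

inputs = np.array([8.5, 0.65, 1.2])    # three arbitrary input features
weights = np.array([0.1, 0.2, 0.0])    # one neuron's weights
pred = inputs.dot(weights)             # 8.5*0.1 + 0.65*0.2 + 1.2*0.0 = 0.98

W = np.array([[0.1, 0.1, -0.3],        # one row of weights per output neuron
              [0.1, 0.2,  0.0],
              [0.0, 1.3,  0.1]])
layer_pred = W.dot(inputs)             # three weighted sums at once
print(pred, layer_pred)
```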
6th Chapter - Ahh, backprop. Once you know it, you wonder how it was ever hard to understand in the first place. There is a lot of good work done in here, but I think this is the first time that you really need to work through the code sample in order to understand what is going on mathematically. Once you get to the section about “Up and down pressure,” I think there is a lack of clarity in its directionality. It would’ve been much clearer if he had skipped the abstraction entirely here and moved straight to actual weight changes. It’s important to note that his overuse of abstraction is probably the throughline behind many of my criticisms: it isn’t that the abstractions are wrong, rather that they are not useful at best and confusing at worst. This is one of those cases where he overcomplicates a very simple movement with a poorly formatted table. However, he makes up for it by smoothly transitioning into ReLU functions. Unfortunately, another problem comes up in the initialization of his weight layers: he neglects to explain the -1 in the initialization expression. I briefly mentioned it earlier, but he fails to talk about why this is done, and he also fails to mention why you’re randomly initializing at all. It is a small thing, but for a book that is about grokking something, not insignificant.
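Since the book never explains it, here’s my reading of that mysterious -1 (a sketch of my own, not his exact code):

```python
import numpy as np
np.random.seed(1)

def relu(x):
    return (x > 0) * x   # pass positives through, zero out negatives

# np.random.random() draws from [0, 1); multiplying by 2 and
# subtracting 1 rescales to [-1, 1), so weights start small and
# centered on zero rather than all positive.
weights_0_1 = 2 * np.random.random((3, 4)) - 1
weights_1_2 = 2 * np.random.random((4, 1)) - 1

x = np.array([[1.0, 0.0, 1.0]])
hidden = relu(x.dot(weights_0_1))   # first layer plus ReLU
output = hidden.dot(weights_1_2)    # second layer
print(output)
```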
7th Chapter - Some visualizations for intuition. May be useful for some people. I found that I already understood them from looking at priors.
8th Chapter - Was very happy to see that print statements were a big deal in this chapter. They help with understanding what is actually going on in batch gradient descent and overfitting. Dropout was useful, although a little bit simplified. I’m not entirely sure how often that simple version of dropout is used in industry, but even if it isn’t, the overall messaging is sound.
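For reference, the simple version amounts to roughly this (my paraphrase; the book’s flavor uses a drop probability of 0.5):

```python
import numpy as np

hidden = np.random.random((1, 8))                       # some hidden-layer activations
dropout_mask = np.random.randint(2, size=hidden.shape)  # random 0s and 1s

# Training: zero out roughly half the activations and double the
# survivors so the layer's expected output stays the same
# ("inverted" dropout with p = 0.5).
hidden_train = hidden * dropout_mask * 2

# Test time: no mask, use the full layer as-is.
hidden_test = hidden
```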
9th Chapter - Much ado about activations. This is my favorite part of the book, solely because he showcased neural networks in a correlation-map format. I’d seen these intermediate visualizations of what a NN is doing before, but this is the first time it really stuck that you can apply correlation maps to the results in order to “see”. Very pleased. As for softmax, I was a little disappointed by how verbose he ended up being; one diagram with a short paragraph (the sigmoid treatment) essentially explains it.
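And indeed, the whole thing fits in a diagram’s worth of code (a sketch of my own):

```python
import numpy as np

def softmax(x):
    # Exponentiate (everything becomes positive), then normalize
    # (everything sums to 1), giving a probability distribution.
    e = np.exp(x - x.max())   # subtracting the max avoids overflow
    return e / e.sum()

print(softmax(np.array([1.0, 2.0, 0.1])))  # roughly [0.24, 0.66, 0.10]
```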
10th Chapter - Despite my previous complaints about verbosity, I don’t feel like there was anything deep about this chapter on ConvNets. For such a popular processing model, this piece felt very unfinished. Luckily, the intuition is easy to grasp, but the deeper mathematical concepts are skated over. I should note that the idea is actually quite easy to grasp if you are familiar with concepts like Hamming codes, which bear an uncanny resemblance here.
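The intuition he does convey boils down to reusing one small set of weights at every position; a bare-bones sketch of my own (not the book’s code):

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image, taking a weighted sum at each
    # position; the same few weights are reused everywhere.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

edge = np.array([[-1.0, 1.0]])        # fires where pixels change left-to-right
img = np.array([[0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0, 1.0]])
print(conv2d(img, edge))              # nonzero only at the edge
```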
11th Chapter - Like a lot of these final chapters, I just feel like the author was giving the reader a taste of what is out there. Many of these subjects are books and specializations on their own, so I can’t blame him for not going into too much detail. That being said, if I hadn’t had experience with NLP before reading this I would’ve been deeply disappointed. His writing on embedding layers is probably some of the least clear writing in this book, which is largely due to his failure to have a separate section on one-hot encoding vs. label encoding.
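The missing section could have been a page long; here’s a toy sketch of the distinction (mine, with a made-up vocabulary):

```python
import numpy as np

vocab = ["the", "cat", "sat"]

# Label encoding: each word becomes an arbitrary integer.
label = {word: i for i, word in enumerate(vocab)}   # {"the": 0, "cat": 1, "sat": 2}

# One-hot encoding: each word becomes a vector with a single 1.
one_hot = np.eye(len(vocab))                        # row i encodes word i

# An embedding layer is just a weight matrix indexed by the label;
# multiplying a one-hot vector into it selects one row.
embedding = np.random.random((len(vocab), 5))
print(one_hot[label["cat"]].dot(embedding))         # same thing as...
print(embedding[label["cat"]])                      # ...a plain row lookup
```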
12th Chapter - Similar to the prior chapter, I think my view here is skewed. I found it quite intelligible due to my knowledge of certain NLP concepts, but I really don’t think it would be very good without that knowledge. I was able to relate the material to things like n-back algorithms. He just didn’t spend enough time on what was going on here.
13th Chapter - I’m unconvinced that the best way to teach a framework is in this ground-up manner. I recognize that this is the ethos of the book, but this is one of those cases where I think a top-down approach works much better than a bottom-up one. A lot of this is just rehashing what you already know, which is fine, but it maybe belongs in a different book altogether.
14th Chapter - Easily my least favorite chapter. Way too short and the intuition building here is almost non-existent.
15th Chapter - This is almost just a, “Hey, here is a fun-ish new thing that I don’t really want to explain,” final piece. I’m not sure you should end a book on deep learning with a section called “Homomorphically encrypted federated learning” when you haven’t been talking about encryption—which is an entirely different field—at any point before.
I was actually pretty pleased with this book. My biggest problem is that I’m not sure it knew exactly what it wanted to be. If I had advice to give, it would be to take the intuitions (both mathematical and abstract) and leave most of the coding samples alone. There are just better ways to work through the code than copying so-so snippets. I’m excited to read through a couple more and see how they stack up against this one.
I like the build-it-yourself approach, rather than showing how to use frameworks. This helped to build up an understanding. I did need some previous knowledge of some of the topics - which I personally have - in order to 100% follow, but I know I will revisit this book in the future.
Some parts did take a while to understand, but this is a hard topic. Well thought through and presented.
After Chip Huyen's book, I started https://www.goodreads.com/book/show/2... , and although Sebastian's book is understandable, I found the sine functions etc. a bit advanced for my level (my level = low) and switched to Trask's "Grokking Deep Learning".
I found the book to be a soft approach to deep technical matters, and most of the time I had a clue about what the author was talking about. Anything beyond my understanding is not the author's fault, but the reader's (because me = cucumber).
Now I understand that deep learning is unfortunately very much like rocket science, and one has to build up a lot of mathematical foundations (statistics, probability, derivatives, etc.). As a regular software developer, I accept defeat: probably nothing I read here will ever help me in my career, in daily life, or in job interviews.
Although I didn't understand a lot of the material, I give it five stars. The author is talented, and I plan to read more of the "Grokking" series!
Unlike other introductory books that I read (e.g., Deep Learning Illustrated, Deep Learning from Scratch), this book introduces deep learning from the ground up -- by implementing key concepts of deep learning from scratch -- and then tying them together into a toy deep learning framework. This helps you learn the under-the-hood details while appreciating the benefits of a framework. Also, the exposition is limited to a handful of activation functions; hence, it can avoid getting into calculus, which is a good aspect of introductory material. The exposition does not cover all kinds of prevalent NNs (e.g., GANs). Again, this helps with the deep dive by limiting the number of concepts one has to remember to understand the material.
The only downside of the book is that it glosses over details without explanation or proper introduction of terms, e.g., what is a state in an LSTM?
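To give one concrete answer the book doesn't: here is a rough sketch of what "state" means in an LSTM (my own illustration, not the book's code):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_step(x, h, c, W):
    # An LSTM carries two state vectors between time steps: a hidden
    # state h (the short-term output) and a cell state c (longer-term
    # memory that the gates selectively erase, write, and expose).
    z = np.concatenate([x, h])
    f = sigmoid(W["f"].dot(z))               # forget gate: what to erase from c
    i = sigmoid(W["i"].dot(z))               # input gate: what to write to c
    o = sigmoid(W["o"].dot(z))               # output gate: how much of c to show
    c = f * c + i * np.tanh(W["g"].dot(z))   # updated cell state
    h = o * np.tanh(c)                       # updated hidden state
    return h, c

n_in, n_hid = 4, 3
W = {k: np.random.random((n_hid, n_in + n_hid)) for k in "fiog"}
h = c = np.zeros(n_hid)
h, c = lstm_step(np.random.random(n_in), h, c, W)
```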
All said, if you are looking for an introductory book for deep learning, then pick this one.
Definitely recommended. I only really read the first half and skimmed the rest. Well explained introduction to neural networks, with good examples. I checked this out from the library but had to return it before I could actually code any of the examples; however, the code was clear and easy to understand. I will probably shell out the cash to buy this one.
Very good first half of the book: an introduction to deep learning without using a framework, with code explained step by step. The second half requires either previous knowledge or studying it in detail, as it has more theory and bigger code samples. (It was my first book on deep learning.)
Excellent book! Best explanation of deep learning I have ever seen! I left it unfinished because I wished I had some real project to apply/test this knowledge on; right now, reading this book felt a bit too abstract. I will surely come back to it if I decide to get deeper into machine learning.
I was inspired by what neural networks can do and decided to dig deeper into deep learning concepts. I was recommended to start with "Grokking Deep Learning". I was told that the style is engaging, the content well motivated, and that the author does not hide the maths while still being easy to follow. I felt that this wasn't great advice (perhaps it would have been for somebody else), as I spent month after month trying to understand the deep learning concepts presented in this book and made no progress. After more than a year I knew I had the wrong foundation for understanding the subject. By this time I had almost given up on deep learning. Many of the examples used in the book, relating to knobs and baseball teams, were irrelevant to me and made the concepts even more confusing. Many deep learning concepts were skipped or not simplified enough, and not explained with enough examples. Nevertheless: great writing style, astonishing authoring skills, and a real ability to communicate. The author's sharp wit and humor made the subject of deep learning very entertaining.
This was a great read. At first I had qualms about its usefulness, but the more I read, the more I liked it. Even though it does not include much mathematics, it is great at tying the maths to a more abstract, high-level understanding.
The way the concepts are described is delightful and intuitive, and the Python code helps in a lot of cases. I did have to skip some parts I didn't find much use for, but overall it was a great refresher of previous knowledge. Plus, it helped me understand some concepts more deeply.
All in all, if you are prepared to skip ahead at times and want to gain a more intuitive understanding of Deep Learning, I highly recommend this book.
The only thing I thought could improve this was more examples of how to do something meaningful with your knowledge. There are lots of hard-coded vectors until the last 3 or 4 chapters, and then the Shakespeare output was not that great. It explains the basic concepts, and the more difficult ones, quite well though. A good beginning for further exploration with other books.
You build your own PyTorch lite to learn how it works inside and out -- a pretty cool approach.
Was hesitating between 4 and 5. In general the book is detailed, illustrated with examples, and contains the answers to the questions that will come up. But it's not as easy to grasp as Grokking Algorithms. Also, mathematical references are explained ad hoc, which is not really convenient for people with some mathematical background -- I had to skip a lot. It would probably be awesome to mark those parts as optional. Giving it 5, though, because of the pros.
What this book does best is making you frustrated and almost angry.
Especially by the last chapters.
Every little typo in the presented source code, wrong indentation, or simply incorrect wording makes it even more of a chore to get through.
But it gets you interested in the underlying concepts and determined to beat it all in the end. And to start reading something like Ian Goodfellow's Deep Learning book from 2016. And dig deeper....
A really great introductory book on Deep Learning. The author does a huge amount of "hand-holding" for the first half of the book. At some points this becomes almost boring; however, it is great because you can follow and learn step by step. A bit after the middle of the book, the pace changes completely. I guess he did it by design, but I have the feeling that he spent much more time and thought preparing the first part of the book than the second one.
That's "Hello, Startup!" in a world of Artificial Intelligence and Deep learning. Book goes through basics. You start by building everything without frameworks so there's no such thing as "what the hell this code is doing" because you see each operation. Although in the middle of the book this started to become burden and I've lost track from time to time, in general everything is pretty clear
An introduction to deep learning. Not as good as Grokking Algorithms. Spends too much time on the basics, and covers some quite advanced topics in the end. Also contains numerous small mistakes and oddities. On the plus side, it does give a good understanding of how neural networks work, with many hints on how to think about them.
Focusing on the core concepts of deep learning this book runs through examples that get you to start creating core building blocks yourself. Sometimes the best books are not particularly thick but have been edited down so they are focused and manageable. This is easy to get through in a reasonable time and will help most people improve their understanding of deep learning.
The book is interesting: it teaches you about deep learning, shows you how to build your own deep learning framework, and by the end you are familiar with PyTorch.
It's a book I'd recommend for people who are just getting started with deep learning.
There are a bunch of errors in the code examples, and some topics are really difficult to understand because of the way the author draws analogies to real life. However, I would recommend this book for beginners like me to get an understanding of the basics.
A gold mine. The only reason this isn't a 5 is the latter chapters, where things get rushed; instead, the author should have given that space to the earlier chapters and made them worth 10x what they already are. The downfall starts with the CNN chapter.
Overall, the book is really good. It explains the basic ideas behind neural networks and some different architectures commonly used. Nonetheless, even though most of the graphs and explanations are intuitive and simple, at times the concepts could be a little bit clearer.
I read only the first few chapters. It is good as an introductory book highlighting the details of implementing a neural network step by step from scratch.