Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing. Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as you generate them.

308 pages, Paperback

Published April 30, 2019

76 people are currently reading
226 people want to read

About the author

Fabian Hueske

1 book, 5 followers

Ratings & Reviews

Community Reviews

5 stars: 39 (39%)
4 stars: 43 (43%)
3 stars: 16 (16%)
2 stars: 1 (1%)
1 star: 1 (1%)
Displaying 1 - 9 of 9 reviews
Ian Wagner
70 reviews, 3 followers
February 27, 2022
Probably the best book available on the subject, which is a bit unfortunate. It wasn't awful or anything, but I found myself frequently stopping, reading the docs, and/or Googling in frustration because explanations and warnings simply prompted further questions that seemed obvious to me, but were not adequately explored. Not the easiest ecosystem to break into, admittedly, but I would still say this is probably the best organized intro available.

In particular, one of my largest criticisms (applies to the JVM ecosystem as a whole *way* more than most for some reason) is the amount of (IMO) unreasonable assumptions made. Terms are frequently thrown around without proper treatment. I actually got through the entire book without feeling like I had a complete understanding of what an operator was. The Flink glossary wasn't all that much more helpful, but at least had something. Do yourself a favor and read the Google Dataflow Model paper first and you'll get a *much* more thorough introduction to some of the crucial terms.

Finally, through no fault of the author, this book is old. Flink has evolved significantly since this was written, some of the APIs are deprecated, and some of the other cautions are either inaccurate or difficult to verify (I still can't figure out whether the limitation re: parallelism settings and savepoints is still valid... the book claims it was written for Flink 1.7, but the only limitations I can find in the official docs reference version 1.2).
72 reviews, 2 followers
November 17, 2020
A very comprehensive book on the ins and outs of Flink, and I read it cover to cover. I have found myself flipping through it as a reference on many occasions when I am curious about some specific implementation detail. I give it 4 stars only because several important releases have been made since its publication in 2019, and it is dated, as some of the most important new features are not included.
3 reviews, 3 followers
June 22, 2019
A great resource for anyone interested in the concepts of stream processing and an in-depth tour of Flink.
Marcin Kuthan
14 reviews, 10 followers
October 13, 2022
Just documentation collected into a book, perhaps partially outdated now. The best part of the book is about general streaming challenges and trade-offs, watermarks, source/sink design, and state management. I found this book interesting even for those who develop streaming pipelines using different frameworks (Beam, Kafka Streams), just to compare the APIs, capabilities, and limitations.

I was really surprised that there is not a single page about automated tests. When I evaluate a new framework, excellent support for automated tests is a must. I don’t understand why such an important aspect was totally ignored.
28 reviews, 5 followers
November 14, 2019
- An approachable and practical introduction with nice examples throughout the book!
- It first presents the overall architecture and then covers the DataStream API in the later chapters, with each one focusing on one aspect.
- I enjoyed the chapter on integrating Flink with other systems like Kafka, Cassandra, etc., and how event guarantees are affected depending on the source and sink. This gives an overall idea of how such systems are deployed, especially for beginners who might not have a holistic picture of distributed systems.
- One thing that could have made this book awesome: a smallish hands-on project at the end covering many of the concepts presented throughout. I think this in itself is quite a big task and perhaps deserves its own book.
Łukasz Słonina
124 reviews, 25 followers
July 3, 2019
A very good introduction to stream processing and to Flink itself (only the DataStream part).

The only thing I would change is the examples: they may be more concise in Scala, but in Java they would be much more readable and easier to follow.

A must-read for Flink users.