There is a lot of interesting material in this chapter. I like the clear contrast between NoSQL and Big Data, the various dimensions of scalability, the big data taxonomy, and the alternative sharing architectures and their impact on use cases. Figure 6.8 comparing queries in MySQL to MongoDB is a wonderful diagram.
Twitter data stream processing is another common Flume use case. Flume can also be configured to provide different levels of quality of service (reliability, performance).
Note that in HDFS, which now has a default block size of 128 MB, a partially filled block does not consume a full block of physical space (unlike the way disk sectors work).
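A small worked example may help readers see the difference. The sketch below is only arithmetic, not HDFS code: the function names are hypothetical, the 200 MB file is an invented example, and replication is ignored (the figures are per replica). It compares the space a file occupies when the last block stores only its actual bytes against a sector-style scheme that always rounds up to a whole block.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB  # default HDFS block size mentioned in the note above


def hdfs_block_sizes(file_size, block_size=BLOCK_SIZE):
    """Sizes of the blocks a file is split into; the last block may be partial."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # partial block occupies only its real length
    return sizes


def hdfs_bytes_used(file_size, block_size=BLOCK_SIZE):
    """Physical bytes used per replica when partial blocks are not padded."""
    return sum(hdfs_block_sizes(file_size, block_size))


def fixed_allocation_bytes_used(file_size, block_size=BLOCK_SIZE):
    """Bytes used if every block were padded to full size (sector-style)."""
    block_count = -(-file_size // block_size)  # ceiling division
    return block_count * block_size


if __name__ == "__main__":
    size = 200 * MB  # hypothetical file
    print([s // MB for s in hdfs_block_sizes(size)])
    print(hdfs_bytes_used(size) // MB)
    print(fixed_allocation_bytes_used(size) // MB)
```

For the hypothetical 200 MB file, the split is one full 128 MB block plus a 72 MB partial block, so the file occupies 200 MB per replica rather than the 256 MB a pad-to-full-block scheme would use.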
The YarcData (a Cray company) Urika looks like an amazing system, with half a petabyte of shared RAM!