Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each.
James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs. With this book, you'll:
- Gain a working understanding of several data architectures
- Learn the strengths and weaknesses of each approach
- Distinguish data architecture theory from reality
- Pick the best architecture for your use case
- Understand the differences between data warehouses and data lakes
- Learn common data architecture concepts to help you build better solutions
- Explore the historical evolution and characteristics of data architectures
- Learn essentials of running an architecture design session, team organization, and project success factors

Free from product discussions, this book will serve as a timeless resource for years to come.
It’s been a while since I read a strictly technical book, so I had this one brewing on my shame list for some time - I was not rushing, expecting it to be a tough read. I am happy to say this is not the case here - it’s easily digestible; clearly James is not only an experienced engineer but also gets along fine with non-code written communication.
This actually makes it both a good thing and bad thing (making it beautifully in line with the only proper answer to any question in the IT industry; in this case the question being: is it a good book? It depends).
This is a very good book for someone who just “browses” for current data warehouse concepts (and buzzwords, because that’s what Data Mesh is imo) and wants to be up to date with the trends (the book is from 2024 and it shows) - but not a very good one for someone who is looking for specific implementation details and a list of tools/solutions. I understand the author’s decision to keep the book as vendor-agnostic as possible (especially since he works for one of the biggest vendors), but there are some places where I would love to see implementation details of specific technologies and the differences between them - sure, I can (and I did) google for examples, but that makes the book a set of guidelines rather than specific instructions. It will make the book age better - but at the same time, it’s not as useful as it could be.
Still, for someone working with (or simply interested in) ways of storing data, it is absolutely worth the read.
Shallow content and sometimes misleading. 2 stars in total. Good for non-technical people in search of a glossary of data-related definitions. I personally didn't get much out of it.
As a mid-level data engineer with over 3 years of experience, I've encountered many "gotcha" moments – moments when I've realized why we used a certain approach at Company X or why we faced challenges serving real-time data at Company Y.
During interviews, I used to mention that we had previously implemented an architecture combining a data lake and a data warehouse (RDW), which I now refer to as MDW :). I now have a broader understanding of MDW.
In addition to these insights, I plan to revisit the chapters on data mesh, as they initially seemed challenging. This book should be a staple in every data professional's library, serving as a valuable reference for different projects and architectures.
In modern distributed architectures, most developers rarely deal with reporting and analytics. These functions are usually handled by a separate team, and most developers' knowledge stops at "we published the event". Developers often assume that everything comes down to code, business logic, and technology, while data is seen as merely the output of those processes. In reality, data is the foundation of everything.
Now, about the book. First, "Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh" does not answer the main question in its own title. The topic does relate to data science, and it is no surprise that it is surrounded by a host of complicated abbreviations for concepts that do not differ much from one another. Overall, the book gives superficial explanations of all these terms. In my opinion, its descriptions are too drawn out, yet it offers nothing on the most interesting part: the technological solutions, the main problems, and so on.
I'm starting to think that every book should be rated on two criteria. So, if you are one of those people who has no idea what happens to the data after "you published the event", this book may be useful to you. It will help you understand, at a high level, what happens in the world of data and how; for that purpose I would rate it 7/10.
However, if you already understand distributed file systems, ETL, relational/non-relational data models, OLTP/OLAP, and data federation, and have basic knowledge of data science, this book may disappoint you. It offers no deep analysis or practical advice and mostly stays at a surface level; for that purpose I would rate it 5/10, and only for the sake of picking up a few new details.
In short, "Deciphering Data Architectures" may be useful for beginners, but experienced specialists should not expect deep analysis or practical advice from it.
Good “no-nonsense” overview of current architectural trends: MDW, Data Fabric, Lakehouse, Data Mesh. The intended audience is not clear; at one point the author begins to explain what a database index is. And while I agree with the author about the limitations and overhype of Data Mesh…do you really need to dedicate the largest part of the book to it in this case?
Strong start and good overview of architectures in the abstract. Deflates at the end going on an escapade about the Data Mesh. It sounds like an atrocious idea, the author thinks it’s a terrible idea, so why devote so much of the book to it? While keeping the book high level the space could’ve been used to expand on the other normal architectures.
A brief introduction to the history of database design/architecture, and the logical reasons behind each technology. It covers how each architecture has advantages over the others (hint: match it to the people and processes in its industry context).
The book, however, does not cover all the technologies in detail - just enough information for managerial levels to get a grasp of all things data.
It’s a good (high level) reference book covering different data architectures used currently.
The author does a very good job of addressing (and outlining throughout the topics) the fact that architectural decisions always have pros and cons in different respects.
This is a good read for data professionals that want to have a helicopter view on the modern/existing data architectures.
Strong tutorial for those catching up with technology changes
This particular reader had some fairly limited but nonzero knowledge of cloud computing, big data, and data warehousing, but only at a superficial management level. This book provided a tutorial on the subject with up-to-date information. It was readable and understandable.
Great book. Essential for learning the basics of a solution architect's general knowledge, so you can decipher and understand the various data architectures available. The book enables non-experts to have the right conversation with architects and data engineers in their firms so you can all be on the same page.
good. modern. sufficiently comprehensive guide on data architectures. will help most people clarify terms and understand the reasons data lakes, data fabric, data lakehouses came into existence and the problems that they solve