Entrepôts de données, guide pratique de modélisation dimensionnelle

The single most authoritative guide, from the inventor of the technique. Presents unique modeling techniques for e-commerce and shows strategies for optimizing performance. A companion Web site provides updates on dimensional modeling techniques, links to related sites, and source code where appropriate.

400 pages, Paperback

First published February 2, 1996

647 people are currently reading
1749 people want to read

About the author

Ralph Kimball

23 books, 27 followers

Ratings & Reviews

Community Reviews

5 stars: 434 (42%)
4 stars: 375 (37%)
3 stars: 152 (15%)
2 stars: 39 (3%)
1 star: 10 (<1%)
Displaying 1 - 30 of 61 reviews
Profile Image for Cedric Chin.
Author of 3 books, 165 followers
May 29, 2020
The best thing about this book is that it is the book on dimensional data modeling, and it is written by the people who invented the approach in the first place.

The worst thing about this book is the organisation.

In the first two editions of The Data Warehouse Toolkit, Kimball et al. decided to organise the book according to use case, which meant that each chapter examined one particular business application (e.g. CRM, Inventory, eCommerce, Insurance, and so on). This was absolutely terrible, because it spread the core principles out over many, many chapters, and organised it in such a way that you couldn't read a random chapter without first familiarising yourself with the ideas presented in all the chapters preceding it.

So in the years since the 2nd edition came out, numerous students of the Kimball method published their own versions of the book, presenting the core ideas in a compressed, principle-by-principle form.

The 3rd edition of The Data Warehouse Toolkit solved this problem by adding a new chapter (Chapter 2) that laid out all the ideas in one place, and referenced where they were introduced across all the other chapters.

If you want to read this book, read the 3rd edition, and read Chapter 1 and Chapter 2 first, before using Chapter 2 to jump around to each idea.

The authors do themselves a disservice by organising the book this way. On top of that, many of their implementation notes have not aged well. The ideas around the star schema are worth reading. But many other implementation details that assume RDBMS performance problems are no longer as relevant.

I hope the Kimball group updates this classic to reflect the realities and capabilities of the modern, cloud-based MPP columnar data warehouse in the near future.
Profile Image for Anh Dang.
10 reviews, 6 followers
February 19, 2020
The book is useful: I learnt a great deal about the fundamental framework of dimensional modeling. I could immediately recognise points from the book in data warehouses I actually work with (though they apply more to old-school data warehouses). That is not surprising, considering the book was published some years ago and the data industry has changed astonishingly rapidly.
I agree with some comments that the book is unnecessarily long; the structure could be improved by grouping material by technical point rather than repeating things from time to time. Big data is mentioned, but not much. Reading it in 2020, you might find it a little outdated. Yet the ideas of dimensional modeling are great and very valuable.
4 reviews, 2 followers
September 6, 2021
As the title says, it is a complete guide. What it doesn't say is that tech books can be humorous too. It might be the first book I have read completely with great interest. The two things the book attends to throughout are:
1) Simplicity
2) Performance
The two components to be focused on while data warehousing are:
1) Requirements
2) Data reality
The generic steps for any data warehousing are:
1) Select business process
2) Declare the grain
3) Identify dimensions
4) Identify Facts
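
The four steps above can be sketched as a tiny star schema. This is only an illustration, not code from the book: the retail-sales process, the table and column names, and the sample rows are all hypothetical, and SQLite is used purely because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1, business process: retail sales (hypothetical example).
# Step 2, grain: one fact row per product, per store, per day.
# Step 3, dimensions: date, product, store.
# Step 4, facts: quantity and sales_amount, both additive.
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    quantity     INTEGER,
    sales_amount REAL,
    PRIMARY KEY (date_key, product_key, store_key)
);
""")

# Made-up sample rows, just enough to run one query.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20240101, "2024-01-01", "2024-01"),
    (20240102, "2024-01-02", "2024-01"),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Cola", "Beverages"),
    (2, "Chips", "Snacks"),
])
conn.execute("INSERT INTO dim_store VALUES (1, 'Downtown', 'East')")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
    (20240101, 1, 1, 3, 6.0),
    (20240102, 1, 1, 2, 4.0),
    (20240101, 2, 1, 3, 4.5),
])


def sales_by_category(connection):
    """Roll the additive facts up by a product-dimension attribute."""
    return connection.execute(
        """
        SELECT p.category, SUM(f.quantity), SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


print(sales_by_category(conn))
```

Declaring the grain before choosing the facts is what makes the `SUM` calls safe here: every fact row means the same thing, so the measures can be added across any dimension.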
Profile Image for Joe.
4 reviews, 2 followers
October 26, 2011
This book is for folks that already understand relational modeling and pitfalls of extracting reports/metrics by directly querying transactional systems.

The second edition was published in 2002, but even so, some of the recommendations seem a bit outdated even for back then. For example, he refers to RDBMSes that don't support table/column aliases and hence recommends creating many redundant views to work around those limitations.

I'd rather the book be aimed at people using modern tools and let folks using older, antiquated tools come up with their own workarounds rather than proposing everyone use the least common denominator. However, that was really my only complaint.

The case studies and examples are well-chosen and highlight many common business scenarios.
Profile Image for Maciej Górczak.
17 reviews
August 18, 2021
In the era of cloud computing this book seems to be outdated. In addition, it is dry and I wouldn’t recommend it to people working with the newest technologies. Certainly it is not useful for developers of any kind.
Profile Image for Timothy.
80 reviews, 1 follower
February 3, 2022
Comprehensive overview of classic data warehousing concepts.

It seems to me you could build a great functional data system for a small to medium sized company by following this thing to the letter.

It doesn't really touch upon the most modern data storage strategies or techniques, but companies who are looking to deploy those practices probably know who they are and are familiar with the concepts in this text. A great reference for tried and true data strategies.
Profile Image for BCS.
218 reviews, 33 followers
January 6, 2014
Increasingly, data is becoming one of an organisation’s most valuable assets. However, the potential value of this data can only be fully realised if it can be organised in ways that facilitate reporting and mining by a range of consumer types.

In order to support reporting across different data silos, data is often integrated into data warehouses, intended to provide a 'one-stop shop' for all reporting needs.

This book describes a principled and pragmatic approach to the organisation of data warehouses using the Kimball Methodology.

After an introductory orientation to data modelling and the Kimball methodology, chapters 3 to 17 each present case studies focussing on the specifics of different industry types and reporting requirements. By building relevant dimensional models, these chapters bring out the challenges presented by various kinds of data, data relationships and reporting requirements.

While most of these chapters start from scratch, chapter 10 offers a slightly different perspective by providing an opportunity to review and critique a proposed dimensional model as if stepping into an in-process data modelling exercise.

These modelling chapters follow a general pattern, which reiterates the importance of early grain declaration and emphasises the use of the bus matrix both in helping to identify relationships between dimensions and applications and as a crucial tool in the development of conformed dimensions and in documenting the data warehouse.

These chapters also explore strategies for identifying and handling different types of slowly changing dimensions and the effects of different data organisations on reporting capabilities. There are useful general hints, anti-patterns and heuristics embedded in each chapter.

The latter chapters of the book move away from dimensional modelling and focus on the key aspects of the Kimball BI/DW life cycle itself. A useful introductory chapter describes the overall life cycle and principal pitfalls. Subsequent chapters delve deeper into the running of the data modelling process, including the identification of key people to input into the process.

There are pointers to the authors’ website for additional resources, such as document templates. The final few chapters include a detailed description of a prototypical extract/transform/load process and a guide to the ETL development process, including a brief discussion of real-time ETL. The book concludes with an outline of the state-of-the-art regarding DW/BI and big data.

This book is written in a very approachable and readable style. While the book is intended to be read as a whole, bullet-point chapter highlights and chapter summaries as well as detailed contents and indexing make the book easy to use as a reference text. In addition to a range of design patterns covering a multitude of scenarios, there is lots of practical advice based upon the authors’ extensive experience in the field.

Reviewed by Patrick Hill CEng MBCS CITP
188 reviews, 4 followers
Read
March 17, 2019
The Data Warehouse Toolkit is written as a self-help book for IT professionals. While I generally dislike it when other people tell me what to do, Ralph Kimball is among the more readable authors. He stresses important points that you might sweep under the carpet even though common sense dictates that you should not, such as getting business sponsorship, investing time and effort in conformed dimensions, and refraining from normalizing the dimension tables.

In order to boost his sales among business executives, Kimball avoids the term 'functional dependency'. This is a pity because the book can be summarized in less than half of its original number of pages if you are allowed to refer to the concept of functional dependency. Still, they would remain interesting pages and I am actually glad that I read this book. Kimball is not Edgar Codd but ETL and denormalized dimension tables are useful additions to a practitioner's arsenal.

Chapter 11, though short, is particularly interesting because it discusses car wrecks. Car wrecks, at least the metaphorical kind, are fun to watch from a safe distance; I am of course referring to IT projects that failed due to bad design.

Have I mentioned that this book is too long? The reading became a bit tedious with the n-th repetition of the same lesson in yet another business context, and even to a non-native speaker the English grammar would sometimes appear unorthodox. The final chapters seem to have been added in later editions but their integration with the basic material from earlier chapters is superficial.
Profile Image for TΞΞL❍CK Mith!lesh .
307 reviews, 191 followers
September 30, 2020
A best-selling book on business intelligence, ‘The Data Warehouse Toolkit’ starts with a short section about the theory of data warehousing and analytics, moving on to a selection of case studies showing how to apply the theory to common business scenarios. It’s also one of the best books for building a BI system.

There are 14 case studies included in the book that stem from numerous industries, such as electronic commerce, procurement, order management, finance analytics, or human resources.

At first glance, the table of contents appears as if the industry-specific chapters apply only to certain sectors, but you’ll quickly find that these industries are used as examples to help readers better understand the underlying design principles. The book covers both technical and business aspects and is up-to-date with state-of-the-art practices and topics. The content covers dimensional modeling techniques and mistakes, bridge tables for ragged variable-depth hierarchies and multivalued attributes, project management guidelines, and a comprehensive review of extract, transformation, and load (ETL) systems and design considerations. A must-read for anyone dealing with data on a daily basis.
Profile Image for Tom Schulte.
3,352 reviews, 73 followers
June 6, 2012
Kimball can't seem to stop himself from knocking the 3NF & 4NF states of normalization, ideal for largely OLTP systems, as he lays out a coherent, cogent approach to pure-OLAP DWH systems where space is cheap and denormalization is a wide avenue to robust data analysis. This book helped me truly grok the importance of keeping measures additive in dimensional modeling design. While the chapters of case studies for different industries begin to seem redundant after the first few, tucked into the final chapters are more gems: surrogate keys, SaaS pros & cons (here called ASP; Kimball dating himself), common design mistakes and, spread all around, some very good discussion of amplifying dates into dimensions. This includes details such as timezones and daylight savings in the case studies on transportation and e-commerce.
Profile Image for Bryan.
19 reviews, 9 followers
January 24, 2008
Everyone I know would refer to this as the Bible of Dimensional Modeling. An excellent introduction with a good degree of depth and written in a case study style that makes it easier to read and digest.

Very numeric fact driven though and as Data Warehouses store more textual style facts some of the principles need to be put in context. This book gives great principles, but as with most things don't take them and think that you can apply them in a black and white, rule-based way. There are several times when Ralph basically leaves gray area out there for interpretation based on one's individual circumstance.
Profile Image for Luca.
78 reviews, 16 followers
May 16, 2016
Must read if you're interested in the topic. I didn't go for the five stars because the style of the book is a bit boring. I guess it's not easy to make the subject entertaining too.
Profile Image for Kellan Marvel.
14 reviews, 5 followers
December 31, 2021
This is one of those books I'll never actually "finish" but it's an excellent reference guide.
Profile Image for Antoni Heba.
11 reviews
July 17, 2025
Who is to blame for filling the lives of BI developers with star schemas, dimensions and fact tables? After years of building those structures I finally decided to find the culprit and I learned that it's not Ralph Kimball. He merely perfected the method and gave a complete description of it in his book, "The Data Warehouse Toolkit", coauthored with Margy Ross. Well, I guess that makes him kind of guilty for the countless star schemas that populate the firmament of data warehousing projects. But is the book still worth reading?

I personally think that the enormous success of the dimensional method is due to its simplicity. The next factor would be its capacity to be implemented in many different environments. This book certainly popularized the method. It insists on simplicity right from the start. It also stresses the importance of putting business first. Those principles are repeated like a refrain all throughout the book. Demonstrating through applications in different businesses is surprisingly effective as it helps to understand and memorize the techniques. It also shows the universality of the model. Generally, the book is very well thought out. One has the impression of being led on a path that starts from simple things and goes all the way down to more complex ones. Even the addition of two chapters about ETL looks like a natural ending. Once the modeling is done, it's time to build the thing. And last but not least, the book is written in an easy and coherent style, sprinkled with humor, which only helps in its digestion.

The meal is a bit too heavy, though. Some of the cases could be made shorter and the chapter about big data seems to have been added to catch up with the changing reality without adding much value. But the biggest downside is the lack of a clear definition of a data warehouse. All we get is a description of its parts and of its functions. But a list of ingredients doesn't make a dish by itself. Another problem is that integration of data according to the author boils down to conformed dimension and facts. Now in my experience that is not enough, and scaling with that approach is a problem.

Downsides aside, this is THE book about dimensional modeling. It has definitively sharpened my understanding of it and answered questions that I have asked myself so many times. I also found there a defense of my efforts to keep users in mind and make my modeling as simple as possible. No wonder that sentences such as: "In many ways, dimensional modeling amounts to holding the fort against assaults on simplicity." or "Without business users, the DW/BI system is a technical exercise in futility." were music to my ears.
144 reviews, 5 followers
October 12, 2022
I will start off by saying the ideas presented in the book are excellent, and I am looking forward to using them in a project that I am currently working on. Further, the book has increased my understanding of the data environment in my current role, allowing me to increase my utilization of SQL for data analysis instead of dumping a bunch of raw data into an Excel sheet and trudging through the slow process of grouping/filtering in Excel. (And oftentimes, it is faster than parsing through the data with Python, even).

I did take away a star from the book for two reasons. First, the book is painfully repetitive. I fully understand and support the idea of tying concepts back together. However, especially in the case studies chapters, the authors don't necessarily need to cover all of the topics from the previous chapters in the same level of detail as the previous chapter.

Secondly, the conceptual topics are covered in great detail, but when it comes to practical SQL, more code examples would be helpful. One of the main topics where this poked its head for me is the Bridge Table concept. In this case, there is an example provided in the additional material, but the example actually only produces an adjacency list table, which is only halfway to the problem that you're looking to solve. (n.b.: If you find yourself in the same situation, look up John Simon's solution - he writes in T-SQL, but I was able to replicate with SQLite using a Python cursor, and then also in PL/pgSQL for a true SQL/PL solution).
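
To make the gap between an adjacency list and a bridge table concrete, here is a minimal Python sketch of the missing half. It is not John Simon's solution and not the book's companion code; the function name `build_bridge`, the tuple layout, and the sample hierarchy are assumptions made for illustration. It expands (parent, child) pairs into one row per ancestor/descendant pair, including a zero-depth row from each node to itself, which is a common convention for hierarchy bridge tables. It assumes the hierarchy is acyclic.

```python
from collections import defaultdict


def build_bridge(edges):
    """Expand (parent, child) pairs into (ancestor, descendant, depth) rows.

    Assumes the hierarchy has no cycles; a cycle would loop forever.
    """
    children = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        nodes.update((parent, child))

    rows = []
    for ancestor in nodes:
        # Walk everything reachable from this ancestor, tracking depth.
        stack = [(ancestor, 0)]
        while stack:
            node, depth = stack.pop()
            rows.append((ancestor, node, depth))
            for child in children.get(node, ()):
                stack.append((child, depth + 1))
    return sorted(rows)


# Hypothetical adjacency list: a small ragged organisation hierarchy.
adjacency = [("HQ", "East"), ("HQ", "West"), ("East", "Store1")]
bridge = build_bridge(adjacency)
for row in bridge:
    print(row)
```

Joining a fact table to such a bridge on the descendant column and filtering on the ancestor column is what lets a single query roll up every level beneath a chosen node, whatever the depth.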
Profile Image for Tim Tilberg.
9 reviews
August 14, 2020
This book is an absolute slog to trudge through, but it's a must read if you work in the technical side of business intelligence and data warehousing. This book prescribes a recipe for implementing an enterprise-class data warehousing strategy featuring everything from understanding the business, getting buy-in from all departments of a company, technical challenges you will face and how to handle them, and finally the technical strategy of how to make it all work.

Each chapter presents like a white paper of a company trying to get their data under the belt, with new challenges along the way. Often the challenges are critical, and immediately obvious when it's pointed out, but not something you think of until you see it: This is why somehow, some way, you must read this book. I had to start a book club at work to get through it.
Profile Image for Martin Ridgway.
184 reviews, 1 follower
February 15, 2018
I ended up skip-reading the second half. The book is divided into a number of chapters themed on various industries and it gets rather repetitive - telling you more about that industry than the things needed to build a data warehouse. The same technical points get made again and again whilst new ideas are dropped in in an unstructured way as needed in (say) chapter 12.
I had to drag the necessary information out of the book.
Plus - it is a little old now and software and hardware have taken enormous leaps since it was written (this might be my fault if this is an older edition) so a more up-to-date version is really needed; hopefully with a lot of re-editing to clarify the structure.
Profile Image for Dustin Steinacker.
74 reviews
Read
October 18, 2019
This is the longest book I've read since Stephen King's It, and aside from the end dragging a bit they don't have anything in common. A fantastic introduction to dimensional modeling in a DW environment, which doesn't presuppose the reader is familiar with anything beyond basic relational data concepts. The focus on case studies is welcome; seeing each principle and specialized technique at work in various contexts helped me to internalize far more than I would have in a more removed setting. Rather than teaching a principle and leading into a "for example," the book is structured around the "for examples" and teaches principles around that.
Profile Image for Monzenn.
837 reviews, 1 follower
March 13, 2025
A surprisingly readable yet dense discussion of a data warehouse. Lots of stuff on best practices about building the best data structure for your use case, and its insistence to read through the use cases makes a lot of sense. SQL usage is notable too, but the best thing about this book is that not having SQL expertise (or frankly interest) will not matter to one's takeaways from the book. I also appreciate the database size estimates, very practical. The light and humorous takes are much appreciated as well.
Profile Image for Dimitri.
206 reviews, 2 followers
September 22, 2024
📕 Why (Not) to read this book (Target Audience)

This book is not too technical and provides practical examples of the bus matrix for different organisational needs.

👀 How this book changed my daily life (Takeaways)

The four key decisions made during the design of a dimensional model include:
1. Select the business process.
2. Declare the grain.
3. Identify the dimensions.
4. Identify the facts.
1 review
August 27, 2023
Taught me a lot about how to build a data warehouse. Not recommended for newbies or freshers. Best suited to engineers/modelers who have at least ~1 YOE. Even though data influencers always talk about how outdated this book is, my take is that 80% of the content is still relevant to today's stack. It's not just about performance or technology, it's about ideology.
Profile Image for Chris.
142 reviews, 40 followers
December 31, 2018
Published by Kimball Group. Authors are Kimball and the President of Kimball Group. Authority stems from "consulting" and publishing in a made-up magazine.

This book is worth a glance by modern data scientists because it shows the bullshit roots of this bullshit field.
6 reviews
November 10, 2021
The book on dimensional modelling, but difficult to navigate.

The case studies are good for introducing the reader to some common patterns and principles for various industries.

Reading this, you would be able to apply the principles and design and create your own models.
Profile Image for Mohamed Taha.
1 review
January 22, 2023
Was not the right book for me. The book is written for high-level managers, not for beginners.
The examples from businesses were not practical.
The book also has a lot of redundant information; a single topic is repeated in more than one chapter.
The book avoids talking about any tools or code examples.
Profile Image for Bavo Vandekeere.
1 review, 13 followers
May 5, 2023
The absolute basics and a great guideline for dimensional modeling.
For me, this is the way to implement a great data warehouse design. A book for everyone in data, beginner or expert level. A clear and proven path to follow!
Profile Image for Samuel Situmeang.
31 reviews, 4 followers
June 3, 2020
The content is excellent. The authors use several real-world examples and the text flows well.