A Degenerate Dimension is
a DIMENSION which exists within the fact table alone - as in a Foreign Key
which has no Primary Key.
Usually when a dimension table is growing at roughly the
same rate as the fact table, there is a degenerate dimension lurking that
has been missed in the initial design.
A degenerate dimension (DD) acts as a dimension key in the
fact table, however does not join to a corresponding dimension table
because all its interesting attributes have already been placed in other
analytic dimensions. Sometimes people want to refer to degenerate dimensions as
textual facts, however they’re not facts since the fact table’s primary
key often consists of the DD combined with one or more additional
dimension foreign keys.
The easiest way to understand what we are talking about is via an example.For instance, we might have a FACT table for sales - tblSales - which contains the following data:
CustomerID ProductID CreditCardType
101 5 Visa
102 6 Mastercard
103 7 VisaIn this case, the CustomerID and ProductID are conventional dimensional references - they are Foreign Keys which have a corresponding primary Key table.
However, CreditCardType is also a piece of dimensional information - something we might want to analyze by - but there is no tblCreditCardType. CreditCardType is a Degenerate Dimension.
Where degenerate dimensions come in is that there are often some columns that we want to have, but that are not measures, and don't have a table of stuff we want to join to. Example: a purchase order number. These columns store something that we want to have (the purchase order number), but to create an empty dimension table would only slow things down. So, to ensure we don't feel bad about breaking the "only a measure or a dimension in the fact table" rule, we just CALL them dimensions- even without the table.
In the fact itself, any attribute of the purchase order that was of interest, and that therefore had values that would each have more attributes we would be interested in would have been turned into a dimension, and a dimension table would have been created.
But to create a dimension table that contains a row for every purchase order would create a very large dimension with nothing in it (since there are lots of purchase orders, possibly as many as there are facts if the grain your fact table is one per purchase order). But our users would not be happy if they could not get a list of the purchase orders included in a given total, or be able to drill down to that bottom level of detail that we've gone to all the trouble to include.
So, when we create transactional level fact tables, it is normal, in fact, necessary to include some degenerate dimensions- include columns that have useful information (very often referencing back to the source system) but that do not join to any dimension table. Plus you can just impress everyone with your dimensional modelling knowledge when you say "degenerate dimension"
No comments:
Post a Comment