Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Create a fact table or use surrogate keys in dimensions?

Status
Not open for further replies.

Cridal

MIS
Dec 1, 2005
2
US
Hi,

I have a general dimension modeling question.

Let's say I have dimensions called CompanyDim and UserDim. Each user belongs to a company (so Company can be considered an attribute of a User). Both dimensions contain other attributes that are of type 2.

What is the best way of keeping track of this relationship?

Solution 1:
Store surrogate key of CompanyDim as attribute of UserDim (consequences: each type 2 change to CompanyDim forces type 2 change of UserDim, dimensions have to be processed in sequence - CompanyDim first, UserDim second, necessity at query time to do dimension-to-dimension join)

Solution 2:
Create an additional factless fact table of UserDim and CompanyDim surrogate keys association (consequence: one more fact table that complicates nice star schema)

How do you guys approach this problem, which seems like a basic data warehouse modelling issue (I haven't found a discussion of this in Kimball).

Thanks for you input.
 
A fact table would store the surrogates for each dimension as needed for a specific point in time.

If each user is in a company has a "transaction" at a point in time then wouldn't a fact table give you the relationship between a user and company at a specific point in time?

 
What if the "main" facts occur at a much lower frequency than changes to user-company relationship? For example, user changes from Company1 to Company2 to Company3, but there are only facts happening when user is at Company1 and Company3. You would never know that user ever was at Company2.

This particular example maybe a contrived one, but the general principle is what counts...
 
It is clear from your dialog that there is a dependency between User and Company - that is, they are not independent variables and hence not separate dimensions.

You have also noted that it is significant that the knowledge that User was at Company 2 even if no facts occurred during that time. This means that the assignment of Users to Companies is probably worthy of representation in the Data Warehouse.

I suggest that you have established the need for an associative (or relation, or bridge) table between User and Company with the start and end dates of employment as part of the key. So, aside from resolving the many to many relationship, it also provides a "factless" fact table by tracking the event(s) associated with Users and Companies - namely start and end employ dates.

There are often cases where dates of certain occurences become needed in "factless" fact tables, or, as I prefer to call them, event tables.


-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top