Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Assigning datetime dimension ID's during ETL

Status
Not open for further replies.

jakewoods

Programmer
Sep 16, 2009
3
US
Hello all,

I am an experienced DBA but am kind of new to data warehouse thinking, so please excuse me if this question is really basic.

I have an OLTP time tracking system I am converting to a DW schema. The system measures time in 15 minute increments, so that is the increment I have used for my date/time dimension table (each row represents a particular quarter hour).

My problem is that not all of the incoming data is that precise (for example, employee hire date has no time component) and I would like to use the same date/time dimension table across the board.

I am past the point where I can separate dates from times into 2 separate dimensions. What I thought of was just assigning a default time (midnight seemed reasonable) to all the "date only" information I am transforming. Is there some obvious reason this won't work? Is there a different standard way for solving this problem?

Thanks,

- Jake
 
I think you might be missing the point. Your 'facts' or 'measures' are the time tracking times. The other dates are part of the date dimension. If you have noted that your facts are actually "events", and are not satisfied with a "Factless Fact Table", they you could make two dimensions, one for date and one for time.

It's important to determine what you are measuring (facts or 'events') and across which aspects you are measuring (dimensions). Hope this helps.

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
OK, I may need to rethink my table design. Thank you.

What do you think about a more general version of the problem though? If you need to measure something at two different levels of detail, do you always need to have two different dimension tables (i.e. date and time) or can you have one (datetime) and have some algorithm for pulling out the right measurement at the higher level? I would think that the problem comes up a lot, like states and counties, or really anything with a rigid hierarchy.
 
Here are my thoughts on the particulars and the general situation:

1. I always have separate Date and Time of Day dimensions. If there is anyway you can re-engineer your Date dimension, then I encourage you to do so. For facts without a time component, the associated fact table would not store a key for the Time of Day dimension.

2. If you stick with your combined Date/Time of Day dimension, then what you can do depends on the toolset being used. For example, Microsoft SQL Server Analysis services allows you to connect measure groups (fact tables) and dimensions at different grains. So you could connect some measures groups at the lowest level (15-minute increments) and other measure groups at the date level.

3. If you're not using a tool like SSAS, and you're utilizing a ROLAP solution with a single, denormalized Date/Time of Day dimension, then I don't see anything wrong with using the Date/Midnight dimension record for your facts which have no time of day. Yes, I suppose there is a small chance that you could confuse your users just a little bit if they think that xyz only occurs at midnight, but I think that would be a very remote chance.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top