Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Populating the Time Dimension Table 2

Status
Not open for further replies.

Jony77

IS-IT--Management
May 4, 2008
9
LU
Hello,

I am still working on my first trial Data Warehouse. In the Time dimension, I planned to cover data from 01.01.2006 till date on hourly basis and I have the following levels; Year, Month, Quarter, Week, Day, Hour, Holiday_Flag.

My question is how do I at best achieve that - populating the time dimension table?

Thanks,

Jony
 
Either via a program or loading an Excel spreadsheet. Program would be less prone to error. Easiest was is take a starting date/time like 01.01.2006 00:00 and use a SQL function to add one hour and INSERT the row. Continue in a loop until you have reached however far in the future you want to populate. Functions such as DATE_ADD. If your DB does not have such functions, you'll need to use integers to carry the YYDDD and HH (yes, use Julian as it's easier). Then run a conversion on the integers to date/time at the time of INSERT. When you increment the hour by one, you can increment the sequence by one, if you don't have built in sequences.

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
Hmm, this is a borderline case where you can either opt for 1 dimension as you propose, or use 2 dimensions: one for date and one for time.
Depending on the reporting requirements You should consider additional values in the dimension: like monthname, weekday, etc.

Populating by means of an SQL script would be my solution. You can google for public/bank holidays and incorporate this in the script.
 
Sorry for chiming in late on this but I have been busy relocating.

I always seperate my Date and Time into seperate dimensions. To me this allows easier analysis of key metrics that are more useful across time. depending on the data and the industry analysis on time may not be of much use. In the casino industry the resort wants to know what people are doing hour by hour or even minute by minute where it may be less useful for say insurance quotes.

I have a standard script I use that does exactly as john has suggested. and can be ran for either the span of a historic load useful for first time population or on a daily basis so I don't have 2 years of dates prepopulated.

I have also found for me that in the MS SQL world FLOOR(CAST(date as float)) makes a great surrogate key and is blazing fast.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top