Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Detecting changed attributes in a dimension table 1

Status
Not open for further replies.

bsellick

Programmer
Jul 26, 2001
24
CA
In the latest Data warehousing Toolkit book, Ralph Kimball suggests using a CRC algorithm as a fast way of checking if one or more attributes of a dimension have changed.

I have no idea how to implement a CRC process. Does anyone know? What other methods do people use to detect if an attribute has changed (well... other methods faster than comparing each incoming attribute against the existing dimension table).
 
CRC is usually a function available within your ETL tool (or perhaps your database). CRC stands for Cyclical Redundancy Check. It's been used in data transmission for years.

Essentially what it does is scan the bits in a string and calculate a number based on some neat little formula. That set of bits will always produce the same calculated number and IN THEORY other patterns of bits will produce different numbers. So if anything changed in the string (or row of data in our case) a different CRC number will be calculated.

The "in theory" part is that sometimes the calculation misses a change. Make sure you use a 32 bit CRC and not a 16 bit to minimize this chance.

The basic technique is to construct some calculation that concatinates together the fields you want to check for changes and do the CRC against it. Then store the CRC number on the database with the dimension row.

Next load, do the same thing for all incoming rows and compare the new CRC for that dimensional member against the one on file.

This sounds very cumbersome and slow but I understand CRC checks are very fast to calculate. Try it out and let us know.

Good luck :)
 
Thanks MattOh.

I'll have to look at both the ETL tool and db (Oracle). Will probably be some time but I will update the thread if I remember.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top