Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sequential Keys 2

Status
Not open for further replies.

MarleneD

Technical User
Jul 4, 2003
72
CA
What are these exactly?

I recently read an article by Kimball on these but still do not have an understanding of how and why they are used.
 
are u talking about surrogate key ?
Because surrogate keys are sequential in nature so are also called squential key.
let me know if this is what you are intending , if yes i will give you explanation on surrogate key.
 
surrogate keys do not have to be sequential. They are also known as non-intelligent keys and are keys assigned to records independent of the production code i.e. a product with a SKU of 11234 would be assigned a surrogate key independent of the code 11234 and this would be stored as the unique key in the warehouse for this record not the production code.

Why ? One big reason is that if you have slowly changing dimensions then the production code would not maintain its unique status as it would need to be stored in the dimension table twice - original record and the updated record with new attribute.

Hope this helps
 
Theortically the surrogate key need not be sequential but when you implement DW using surrogate key they are generally sequential. The advantage is that the look up coding become more easier and a simple query of max surrogate key or the surrogate table will give you the number of records in the dimension tables.
Ok now about surrogate keys. According to RK the design of datwarehouse should be isolated from changes made to the operational data source. So if you have a product dimension and the source has a product master with product no as primary key. The dimension table should not have productno as it key but instead a alias key (surrogate key) should be generated for each unique record in the operational system. now in the example i have given the information may look redundant but it is very useful in case the operational system has composite primary key and as mentioned to capture history of slowly changing dimension values.
 
This may be a dumb follow-up question - please excuse my nievity on this subject.

How frequently are surrogate keys used in data warehousing?

Are surrogate keys a recent introduction?
 
We always use surrogate keys when creating new fact and dimension tables. We had some old tables and ETL code around that didn't use them, and we haven't got around to retrofitting. As far as I can tell, surrogate keys started to turn up around 1997 - 98, though I really haven't done an exhaustive search. Kimball didn't talk about them in his earlier book - The Data Warehouse Toolkit, 1996. But they are in his Lifecycle Toolkit, 1st edition, 1998. I think they were mentioned in magazine articles before that.
 
The look up code becomes redundant if you implement your surrogate keys as digests of the reference data primary key.

Why ? when applying the transaction it is a case of doing the equivilant digest to the fields on the transaction - hence no need for the look up

(digest function I have written uses a standard MD5 aglorithm available in Oracle 8i + to produce a fully reproducible key for the inputs)
 
Bigsacks:
What is a digest function, could you please give an example.
Thanks,
Rohit
 
Hi, surrogate keys are easy to generate both in Informatica and Datastage, and yes, thery are important for slowly changing dimension. By the way, in which way can I realize the slowly changing dimension in Datastage?
Thank you for your attention.
Bye.
 
Guys

Here is the digest function.

CREATE OR REPLACE FUNCTION DIGEST.OMD5
(
p_string IN VARCHAR2
)
RETURN VARCHAR2
AS
v_string VARCHAR2(16);
v_hexstring VARCHAR2(32);
BEGIN
v_string := dbms_obfuscation_toolkit.md5(input_string => p_string);
v_hexstring := rawtohex(utl_raw.cast_to_raw(v_string));
RETURN v_hexstring;
END;

Creates a reproducible 32bit String - great for finding dups also !

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top