Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

22 tables about "Codes"

Status
Not open for further replies.

A3083131a

Programmer
Sep 24, 2010
3
CA
I have 22 tables with two fields (1- Codes, 2- Description) in each of them. I am building a star schema.Is it possible to make them dimensions around Fact? There will be 35 Fks all together in the fact.
Or Make a junk dimension of all the Codes tables? How to do that then.
Please advice
Thanks
 
If I understood correctly, you want to create a generic dimension table so that you don't have to maintain 22 separate ones. Other than that advantage, I do not think there is any other benefit

I do something similar in my cleansing/translation phase. I use the generic table to validate common-codes coming from a feed. I also have a translation field that specifies an equivalent common-code in the event that an old code is needs to be translated to a one.

Your generic table will require a field that identifies the dimension table it represents, such a PRODUCT, SUPPLIER, REGION, etc. I called my field common_code_entity_key

My generic table looks like this:

SQL> desc common_codes
Name Null? Type
----------------------------------------- -------- ----------------------------
COMMON_CODE_ENTITY_KEY NOT NULL VARCHAR2(30)
COMMON_CODE NOT NULL VARCHAR2(30)
COMMON_CODE_DESC VARCHAR2(60)
COMMON_CODE_TRANSLATION VARCHAR2(150)

The common_code_translation field is typically null unless a code needs to be translated - in which case I can specify the new code or a regular expression. The regular expression my be useful if for example, I want to remove punctuation from a phone number or Canadian postal code.

I hope this helps and provides you with some ideas.

 
I don't like the generic codes table solution because you end up with twenty two role playing dimensions.

My first suggestion would be to work with your users to model the data -- instead of seeing that you've got twenty-two codes tables in your source system and just slapping them into your schema, find out if these codes have any business meaning. You might end up with several dimensions with attributes and hierarchies other than just your codes and descriptions.

Then, depending on what is left over, I would possibly create a junk dimension. One thing I don't like about a junk dimension is that it leaves something to be desired for browsing by the end user. It's not very intuitive to open up the "Miscellaneous" dimension and scroll through a bunch of unrelated attributes.
 
@dkyrtata:
Please correct me if I am wrong, basically you are saying that I should have a dimension table that will have 22 records in it. Every records will represent each Code table of the source.
Have I understood you correctly?
 
A3083131a

Yes that is basically what I said.

Although it does not make a difference, I did not realize that you only have 22 records. I thought you had 22 tables, each having many records inside your generic code table. So your SUPPLIER table (i.e common_code_entity_key='SUPPLIER') could have 10,000 records, and your PRODUCT table(common_code_entity_key='PRODUCT')could have 100,000 records. But with an ever-growing number of records, your generic table will lead to performance problems.

With only 22 records however, it sounds that you can easily back out of this approach if it proves to be problematic.

What kind of data will you be storing. Is it for end-user reporting or is it meta-data to support your load-process?
 
I am using it for end user reporting.

You thought right at first place that I have 22 tables that have onle two columns:
1-Code
2-Descriprion

But in each of these codes and descriptions there are thousands of records. So look like your solution will not work here.
I was thinking to make a junk dimension and include all of them in it (that's what I read on kimball's book), but how to do that? I don't know. The answer of this I am looking for.

 
An alternative would be to use one physical table with views or (Oracle) materialized views for each of the dimensions. Again, not sure what this gets you. I can't see where there would be that much performance gain. Small tables are usually best read by a full table scan, so you might gain some performance by having FK's but not putting indexes on them.

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top