Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Question about Bridge Tables in star schema

Status
Not open for further replies.

spshealy

Programmer
Jan 14, 2005
1
0
0
US
I have a question about brige tables in star schemas. Most of the dicussion I have seen about(Kimball centric) bridge tables has weighting factor included into the bridge tables(which is usually set to inverse of the number of member of the set). These weighting factors are used to adjust sum queries so that the facts are not n(double triple etc) counted/summed in queries.

i.e.

select
d.field,sum(f.fact1 * b.wt)
from
dimension d,
fact f,
bridgetable b
where
f.dim_bridge_fk = b.sk and
b.dim_fk = d.dim_sk


Now my question is... it seems that if in your query you constrain the dim table on a field it will mess up your sums

i.e.

select
d.field,sum(f.fact1 * b.wt)
from
dimension d,
fact f,
bridgetable b
where
f.dim_bridge_fk = b.sk and
b.dim_fk = d.dim_sk
and d.field1 = 'somevalue'

In this case it is possible that the weighting is no longer accuarate b/c I have potentially limited the number of dim_records that make up a set in the bridge table. Since the weight is static and usually equal to the inverse of number of members of the set , I suspose you can't contrain the dimension table and get accurate sums..

Is there some way to calulated this weighting factor dymically so that you can get accurate sums and counts when you query constrains on the dim_table?

Thanks
Scott


 
Before starting the query, take the inverse of the inverse giving you the number of members in the set. Then, assign a query variable to the number of rows returned (for counts) or to the sum of a measure of interest (for sums). Between these, you can should be able to determine the weighting at query time.

-------------------------
The trouble with doing something right the first time is that noboby appreciates how difficult it was.
- Steven Wright
 
The purpose of a bridge table is to resolve the many to many relationship between fact and dimension. Given that scenario you can either report out at the lowest level or highest level. Assigning a weighting factor (based on number of occurrence of values in the group on the bridge table) allows you to report at the lowest level.

By ignoring the WF and taking a MAX you can show the max at an aggregate level (highest level).

Dynamically changing the factor and the extended measure (assuming dollars) would result in the same information for a specific occurrence on the bridge table to show different values on different queries for the same point in time and for those of us that do data warehousing for a living this is not a good thing.

example
claim dollars for a specific doctor visit = $100
for the claim there are 3 associated diagnosis codes
therefore the bridge table would contain 3 rows
Claim# DC1 WF=1/3
Claim# DC2 WF=1/3
Claim# DC3 WF=1/3

Report through the bridge shows
Claim#
DC1 Amt = 100/3 = 33.33
DC2 Amt = 100/3 = 33.33
DC3 Amt = 100/3 = 33.34 (assume last one smoothes the rounding)
The individual amounts in this case are arbitrary and it the aggregate number that is accurate $100.

If I now limit my DC to DC1 and DC2 then the report shows
Claim#
DC1 Amt = 100/2 = 50
DC2 Amt = 100/2 = 50
again the indivdual amounts are arbitrary and the aggregate number is correct $100.

hope that helps .. is a solution not based in reality a solution at all?



hope that helps
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top