Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Extended Star Schema

Status
Not open for further replies.

138006

IS-IT--Management
Dec 30, 2003
101
0
0
IN
Hi
The scenario is,
I have 3 dim tables say D1,D2,D3 wih a Fact table F1 say. Now I have another table D11 say which is snowflaked with the D1 table.

Now my requirement is to have a SQL of the form

Select
D1.a1,
D2.a2,
D3.a3,
sum(F1.f1),
sum(F1.f2),
min(D11.a4)
from D1,D2,D3,D11,F1
where
D1.d1=F1.d1
and D2.d2=F1.d2
and D3.d3=F1.d3
and D11.d4=D1.d4
group by
D1.a1,
D2.a2,
D3.a3

How to do this? Note that D11.a4 is a varchar. If I create a fact on D11.a4, MSTR expects D11 as a fact table, tries to join D1,D2 and D3 with D11 resulting in incompatible SQL.

I can's create Min(D11.a4) as an attribute in report cause that will include the Min(D11.a4) in the group by clause which is incorrect.

Do I need to create an attribute on D11.a4 and then create a metric on it and use it in report?

If I have to use it a s a metric (based on an attribute or fact) then what level do I have to set for this metric?

Please help

Thanks

 
You don't need to change your facts. f1 and f2 should remain unchanged on F1.

You need to create an attribute on D4 and make it a parent of the attribute on D1. Just pop the attribute on D4 onto your report and it will use it.
 
hi
To add:

D11 table has no PK. Its joined to D1 on a column with cardinality 1:1. So its not an ideal snowflake/normalized table.
Even then will creating an attribute on D4 and making it a parent of the attribute on D1 work? As far as I know we do so if there is 1:n relationship.

Also what will happen to min(D11.a4)? Will it be an attribute or a metric?

Please let me know
 
D11 has a PK... It's just the same PK as D1, right?

When you specify the parent-child relationship, you can specify one-to-one. Not that it matters, one-to-one is the same as one-to-many for most circumstances and treating the former as the latter will never yield an incorrect result.

Why do you need min(D11.a4)? If you need that field is just a description field for one of your attributes and you don't want it in your group by clause, take a look at your VLDB settings.

If it's a legitimate min function, then model D11.a4 as a fact and create a min metric on it.
 
No, D11 doesn't have PK. There is a non PK field in D1 which is joined to D11.

What exactly will I see in VLDB? Please let me know your direction of attacking the problem.

If I use just D11.a4 in the report (no min) then I can find that the metric values are multipled by 4 times than whats actual. This means that the entire set is duplicated 4 times and the metrics have been aggregated thus.

please advise
thanks
 
Is the field that D11 is joining on to D1 unique on D1? If not, you will get duplicate records when you join, which may be what you are after.

Just map a fact to D11.a4 and create a metric that is the min of that fact.
 
hi
If I create a fact on D11.a4 and then a metric which is min(fact), MSTR expects D11 as a fact table, tries to join D1,D2 and D3 with D11 resulting in incompatible SQL.

D11 will be joined to D1 only and not all dim tables. But defining afact on D11 makes MSTR SQL engine treat D11 as a fact table.ANd it tries to join the other dim tables in the report with this resulting in cartesian join.

This is not what we want.
Thanks



 
Change the dimensionality on the metric. Remove the ReportLevel entry and add the D1 attribute using standard/standard.
 
Hi,
Since the metric is on a fact , it creates a separate SQL

Select
min(D11.a4)
from D11

It then joins this SQL with other SQL having other facts. As you can see, this SQL will return one min value for the entire set. However I want a min for that attribute set.

How can I get the min in the same SQL as the rest of the facts?

Also, in the dimensionalty of the metric should I add all the attributes present in the report (basically all attributes of the Group By clause) or just the one which joins this table with another?
Please help.
 
138006, here's my suggestion for how you should set up your model to "recreate" your SQL, although I always recommend that one shouldn't make MSTR reproduce existing SQL.

Attribute A1, ID = D1.d1
attribute A2, ID = d2.d2
attribute A3, ID = d3.d3
attribute A4, ID = d11.d4, DESC = d11.A4. Make sure D1 table also shows up in the attribute editor for A4. Make A4 parent of A1.

update schema.

To get your min(D11.a4), create metric M4 defined as min(A4@DESC) This will force the sql to do min(d11.a4).

Then on your report place A1,A2,A3 and the metrics. You may get an additional pass of SQL with a small cross join between D2, D3 and D1, but the answers should be what you are looking for. I suspect the 2nd pass will be faster than a single pass because you eliminate a very expensive join between D11 and D1 for every fact table row.
 
Hi
The crossjoin you are referring is not small. My actual SQL is:

select a17.US_RELEASE_NAME US_RELEASE_NAME,
a13.REGION_DESCRIPTION REGION_DESCRIPTION,
a16.FISCAL_PERIOD FISCAL_PERIOD,
a15.LICENSOR_TYPE_DESCRIPTION LICENSOR_TYPE_DESCRIPTION,
a14.FISCAL_PERIOD FISCAL_PERIOD0,
a13.LICENSEE_NAME LICENSEE_NAME,
a13.IVS_COUNTRY_CODE IVS_COUNTRY_CODE,
a13.IVS_COUNTRY_DESCRIPTION IVS_COUNTRY_DESCRIPTION,
a12.CATALOG_ID CATALOG_ID,
min(a11.ROYALTY_FORMAT_CODE) WJXBFS1
from ROYALTY_FORMAT_MAPPING a11,
CATALOG_DIM a12,
PARTY_MARKET_DIM a13,
TIME_DIM a14,
DEAL_DIM a15,
TIME_DIM a16,
RELEASE_DIM a17
where a11.MEDIA_FORMAT_CODE = a17.MEDIA_FORMAT_CODE
and (a11.ASPECT_RATIO_CODE = 'P/S'
and a13.ENTITY_DESCRIPTION = 'Licensee'
and a13.ENTITY_TYPE in ('L', 'S'))
group by a17.US_RELEASE_NAME,
a13.REGION_DESCRIPTION,
a16.FISCAL_PERIOD,
a15.LICENSOR_TYPE_DESCRIPTION,
a14.FISCAL_PERIOD,
a13.LICENSEE_NAME,
a13.IVS_COUNTRY_CODE,
a13.IVS_COUNTRY_DESCRIPTION,
a12.CATALOG_ID

As you can understand that this takes a lot of time. Any alternative and feasible solution to get rid of the Cartesian join???
 
yes, there's a way you can avoid the cross join and still get the same results.

In the M4 metric, for its dimensionality, remove "report level", and add Attribute A4 in it.

Now your SQL should look like

select a17.US_RELEASE_NAME US_RELEASE_NAME,
min(a11.ROYALTY_FORMAT_CODE) WJXBFS1
from ROYALTY_FORMAT_MAPPING a11,
RELEASE_DIM a17
where a11.MEDIA_FORMAT_CODE = a17.MEDIA_FORMAT_CODE
group by a17.US_RELEASE_NAME,

The WHERE clauses will be applied in the second 2nd pass of SQL.

Your report results will be the same.
 
Hi
Please note that the min should be grouped by against

a17.US_RELEASE_NAME,a13.REGION_DESCRIPTION,a16.FISCAL_PERIOD,a15.LICENSOR_TYPE_DESCRIPTION,a14.FISCAL_PERIOD,a13.LICENSEE_NAME,a13.IVS_COUNTRY_CODE,a13.IVS_COUNTRY_DESCRIPTION,a12.CATALOG_ID

and not against a17.US_RELEASE_NAME alone.
So I think ur suggestion will give wrong data.
Thanks
 
here's my logic for why it will work. for each a17.US_RELEASE_NAME, your min() is the same regardless of all your other grouping variables. In other words I think your min metric will look like this

min d1 d2 d3
--- -- -- --
10 1 2 3
10 1 2 45
10 1 2 466
10 1 3 55
11 2 2 4

So the first pass will calculate the following table

min d1
--- --
10 1
11 2

I recommend you try it, it is easy to create the query. If it doesn't work, then I stand corrected.

Also if it works, you can try both the single pass version (the one you wrote), and the multipass version (generated by MSTR) to see which one is faster...
 
138006, any news here. Did you get it to work?
 
The cross join as u can guess takes a lot of time. This is not acceptable to the client. We have to think some alternatives. But I am rather surprised since this seemed a rather common requirement.
 
did you try my solution without the cross join, posted April 16, 2003?

The query grouping will be done in the last pass, so there is no cross join.
 
This should be close to a cake walk IMHO, assuming a4 is a unique column name.

Do as nlim suggested, but without A4 DESC (don't need):
Attribute A1, ID = D1.d1
attribute A2, ID = d2.d2
attribute A3, ID = d3.d3
attribute A4, ID = d11.d4 Make sure D1 table also shows up in the attribute editor for A4. Make A4 a parent of A1.

Then make facts and metrics accordingly. assuming Metric sum(F1.f1) as M1. For min(D11.a4) metric, use the formula
"applysimple("min(a4)", M1)"

Put everything in a report, with a dummy filter defined as
"id@A4 is not null"

You expected SQL should be:

Select
D1.a1,
D2.a2,
D3.a3,
sum(F1.f1) wfxuf1,
sum(F1.f2)wifyno1,
min(a4) wixfno1
from F1 a11, D1 a12,D2 a13,D3 a14,D11 a15
where
D1.d1=F1.d1 -- should apply alias but i won't bother
and D2.d2=F1.d2
and D3.d3=F1.d3
and D11.d4=D1.d4
and D1.d4 is not null
group by
D1.a1,
D2.a2,
D3.a3


 
One clarification:

WHat does this suppose to do:
"applysimple("min(a4)", M1)"

 
138006,

There is another alternative to solve this which should avoid cross join.

1. I presume that there is a relationship exist linking D11, D1 and the fact table F1 through column d4 and d1...

2. Create a dummy fact in F1 called DummyFact, such as (1).

3. Change the metric definition for Min(D11.d4) to Min (D11.d4*DummyFact).

Then you should see this SQL

Select
D1.a1,
D2.a2,
D3.a3,
sum(F1.f1),
sum(F1.f2),
min(D11.a4*1)
from D1,D2,D3,D11,F1
where
D1.d1=F1.d1
and D2.d2=F1.d2
and D3.d3=F1.d3
and D11.d4=D1.d4
group by
D1.a1,
D2.a2,
D3.a3

Let me know if this works. Thanks,

Phoenixier
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top