Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Rule for Partition

Status
Not open for further replies.

Shamli

Programmer
Oct 1, 2008
5
US
Hi All,

We have 6 Mln rows in Fact table with 100+ measures. As of now we have 3 partitions with following config

Part1 : 1,834,099 with 0%Aggregation MOLAP
Part2 : 3,092,143 with 30%Aggregation MOLAP
Part3 : 1,105,211 with 30%Aggregation MOLAP

We are using SQL Analysis Service 2005 on 64Bit Server with 2CPU and 8GB RAM. We are using VMWare.

Question1: All our measures in FACT tables are related, is it good to have single FACT table or we should some how split it into multiple fact tables.

Question2: Considering our system configuration, should we create more partitions. How much should be size of a partition and how is it related to system configuration. I read on Microsoft website that a partition should not have mroe than 20 Mln rows but could not find proper RULE/FORMULE to divide partitions.

Question3: At this moment CUBE PROCESS (FULL)take 30 to 40 mins. Is it right time or can we get better performance if we increase hardware CPU/RAM. How much should be ideal CUBE PROCESS time for 6Mln rows?

Question4: I read somewhere if design is OK, we should be able to process 2-5Mln rows per minute. Is it true. With our hardware how mich time it should to take to process cube with 6LMN rows in Fact and 10 Dimensions.

Dimensions are not big, except one Dimension with 100,000+ and 2nd with 5000 memebers, others are less than 2000.


Thanks in advance.

 
Corection

Our Fact table has 200 measures not 100 and all are related.

 
when you say the measures are no related, are they not related to eachother or not related to all dimensions joined to that fact?

200 Measures does seem like quite a lot especially for a single measure group.

Most people start by partitioning the data across an element of time such as Month depending on the amount of data they may go lower say to the day level or even involve another dimension in the partition Say Month & Product Category.

How frequently is your cube loded?
What is the partioning strategy in your data mart?
What is the relationship of your measures to one another?
How man months/years of data do you have loaded?


Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
I mentioned all MEASURES are RELATED so it may not make sense to break Fact table but still if breaking Fact Table may help than we can consider that too.

I have partition on base of date whcih is working fine. I want to optimize our environemnt and trying to figure out if we can do somthing better and hense all my questions in original email.

It will be really helpfull if you could reply to questions.

Answer to your questions:
How frequently is your cube loded?
Ans: It is loaded 4 times in 24 hours. 3 times during day and once time in late night

What is the partioning strategy in your data mart?
Ans: We donn't have any partition on Fact table in warehouse, if that was your question. Do we need to create partition on FACT table and what should be STRATEGY/RULE.

What is the relationship of your measures to one another?
It is hard to explain, it may be possible to group them up in separate FACT if need. Again question is, is it requried? I read on MS website that recommendation to KEEP related MEASURES in single FACT table. Each MEASURE explain some aspect of our lowest GRAIN in FACT and it is possible to use any set/group of MEASURES in query/report. We are using 3rd party tool called Tableau against our cube.

How man months/years of data do you have loaded?
Starting from 2003 till June 2007 we were only saving month end data but since JUly 2007 we have DAILY data also. Over all we have 400 dates in Time dimension till date.
 
What is your cube partinioned on? You have 3 partitions but how are the sliced?

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
oops!

Also what is the Date/Time grain of your data? both in the DW and in your cube.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
What is your cube partinioned on?
Cube is partitioned on date, for example

Partition1: Most recent 200 dates
Partition2: Most recent 201 to 300 dates
Partition3: Older than Most recent 300 dates.

How it works: During warehouse load we update a field called [DaysDiff From IsRecent] which has DAYS difference between given date and current date for instance
For today: [DaysDiff From IsRecent] =0
For Yesterday: [DaysDiff From IsRecent] = 1
etc..

In both warehouse and cube DATE/Time grain is DATE so we DON'T have HOURLY data. LOWEST time factor is DATE and than Month and than Quarter and at last YEAR.
 
So records can move from Partion 1 to partition 2 over time?

If this is the case then you should revisit the design. There is really no point unreloadin and unloading records as they get older. You can partition your cube by month and filter the base query used to process the partition where Date Between first and last or if you have implemented a proper date dimension you can say where date_sk between x and y.

how long does it take you to do a Select * From Table where IsRecent = 1 ?

If it does not yet exist you may want to put an index on this column.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
We have INDEX on field used to filter partition.

Our historic data also get UPDATED daily so we are doing "FULL PROCESS" daily.

Question1: Is there any better option other than "FULL Procss" considering the fact our historic data get UPDATED daily. According my understanding INCREAMENTAL update wouldn't work in our environment.

Question2: Is it still BAD to move records between partition if we are doing "FULL Process"

So you are suggesting it is BAD design to let records move from one Partition to other. I was not aware of this fact, thanks so much for pointing it out . Could you please give specific example to partition our Fact table in cube. I couldn't understand what do you mean by

"You can partition your cube by month and filter the base query used to process the partition where Date Between first and last or if you have implemented a proper date dimension you can say where date_sk between x and y. "

My Time dimension has following fields

TimeKey int IDENTITY,
ImportDate_Org datetime NOT NULL,
YearNo int NULL,
QtrNo int NULL,
Qtr varchar(2) NULL,
MonthNo int NULL,
MonthName varchar(9) NULL,
Weekday varchar(10) NULL,
MonthEnd varchar(5) NULL,
QuarterEnd bit NULL,
YearEnd bit NULL,
[Is Recent Single Import Date] bit NULL,
[Is Recent Single Import Date -1] bit NULL,
[Is Recent + Trailing 3 Month Ends] bit NULL,
[Is Recent + Trailing 12 Month Ends] bit NULL,
[Is Trailing 365] bit NULL,
[DaysDiff From IsRecent] int NULL

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top