Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

MS Olap w/ Oracle Datasource 1

Status
Not open for further replies.

SteveQL

Programmer
Jul 6, 2001
15
US
Hi,
I am in the process of building some cubes using Analysis Services that utilize a star schema in an Oracle database. A problem I've encountered occurs when I try to process the cube. Apparently the SQL that is passed to Oracle contains an Order by clause, when Oracle tries to do the sort, it wants to dump the entire data set into temp space and then do the sort. My problem arises because it is what I would characterize as a very large data set (in excess of 250 Million rows) As you might imagine I run out of temp space pretty quick and the cube processing fails.

Is it possible to control the SQL that gets passed to Oracle and eliminate the sort? Or am I just going to need a ton of disk space allocated as temp space.

Thanks!
Steve

 
I don't believe you can do anything about the SQL statement AS spawns. You may consider partioning your cube so that the size of the dataset is smaller in each partition. Does the cube contain a distinct count measure? If it does contain a distinct count then you will really want to reduce the amount of data being processed at any given time.


Paul
 
Thanks Paul,
Well, heck, I bet a partiton would do the trick... I only have 3 dimensions and luckily, one of them is not a changing diminsion... sounds like I have an easy choice for a data slice... Thanks a bunch Paul, I think you just handed me the solution! And yes, by the way I do have a distinct count measure so a partition makes sense for a variety of reasons!

Cheers!
Steve
 
FYI

If you have multiple measures in this cube and have not already done so, I would also suggest making a cube identical in structure to this one that contains only the distinct count measure, then bring these 2 cubes togther in a virtual. Distinct count measures affect aggregations in an adverse manner when in a cube with other measures. The pain about distinct count measures is that they are processed in memory not on disk like other measures. The number or records and the amount of memory available to Analysis Services have an impact on the number of aggregations a cube with a distinct count measure can have and the amount of time it takes to process them. My experience when working with Distinct Counts has always been the smaller the working slice of data the better.

Paul
 
OK, so If I'm understanding you, I should create another cube that contains my distinct count as it's only measure... same diminsions, partitions etc... then join the two in a virtual cube. Does this put me in a situation where I know have two process two cubes of the same size and have therefore doubled my processing time?

Also, I noticed that when I started building the partitions, that the origianl partition (the whole cube) is still there. When I go to process, will it process each partition sequentially or in parallel. Remember that the main thing I'm trying to accomplish here is to feed Oracle SQL that will return a smaller dataset.
 
Yes you will have 2 cubes to process but you may notice that each cube is smaller in Size than the original single cube and the combine total size is smaller than the original as well. As for processing the time will probably be less than the single cube. I had a cube that took 45 minutes per partition to process when I broke it out to 1 cube for distinct count and one for all other measures the combined processing time was less than 10 minutes. Naturally I grabbed half that time back for aggs. Analysis Services only processes sequentially but there are tools that allow parallel processing. Depending on the size of your server you could parallel process these 2 cubes with out an adverse affect to ORACLE. If you do parallel process you want to avoid parallel processing of the distinct count partitions.

To get rid of the original partition you could point it to a view that you have the oracle equivalant to "where 1 = 2" (Sorry I do not know PL-SQL so I am not familiar with it's syntax) and then reprocess the empty dimension which will then be empty.

Paul
 
Cool!
I do have a question about the original partition, You said point it to a view that is a "Where 1=2" so I'm assuming that really what I want it to do is point it to an empty set of data. Now I'm partitioning on a diminsion of products that are assigned categories (the highest level of the hierarchy) Could I point the original partition to where the Category Name is Null? That way hit an empty or extremely small dataset? This would help me to identify data mantenance issues as well.
 
That would also work. The goal is to remove the prexisting data.

Paul
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top