A troublesome MDX-Statement 1

drunkenwallaby · Apr 27, 2005

Hi everyone.

I'm having trouble to get a MDX-statement to work and I hope somebody around here may point me to some helpful tutorials or similar stuff.

To start with, I'm a statistican who is quite familar with SQL queries but I've not managed yet to create a MDX-statement to calculate a specific computed measure. Let me explain shortly the setup of my (very simple) data-cube. It consists of only 6 dimensions. Those are:

- region (9 parameter values, star-scheme)
- age (10 parameter values, star-scheme)
- education (4 parameter values, star-scheme)
- nationality (4 parameter values, star-scheme)
- labour-force status (4 parameter values, star-scheme)
- sex (2 parameter values, star-scheme)

The population is divided in a lot of subgroups, which means every possible combination of parameter values of the dimensions may be considered as such a subgroup. Example: Subgroup "d" is defined as all objects, that live in region '1', their labour force status is '2' and their sex is '1'.

With data coming from a sample survey, the total domain size N_d in any of the possible subgroups G_d is estimated with the help of some (known) weights w. To make a long story short, the values in the cells of the data-cube are point estimates \hat N_d for N_d, the total domain size of group "d" in the population.

My task is now, to compute a variance estimator with the following formuar (1) by using a MDX-Statement. The variance for a small domain "d" within a specific region "r" is given by:

Var(\hat N_d) = \sum_{r=1}^{9} (N_r)^2 * (1-f_r) * (pd (1-pd))/(n_r-1) (1)

where
- N_r ... total population size (estimated domain size for region r)
- n_r ... sample size for region r
- f_r ... "sample fraction", which can be calculated as: n_r / N_r
- nsd_r ... number of sample objects in the selected subgroup "d" in region r
- pd ... nsd_r / n_r

My problem is that I have no clue how to create an MDX-dtatement that calculates those values simultanely for all the cells that are returned as a result set from the data-cube when it is processed. The biggest problem that I have is, that I don't know if I can use information stored in a table that is neither a dimension, nor a measure to compute calculated measures. If this was possible, I think implementing formula (2) isn't that easy. I could store all the constants in a separte table and most of the work would be to get the number nsd_r out of the cube...

I'd be very glad if someone has some hints for me. Really, and I'd happily pay lots of beer of stuff for the one who guides me to a successful MDX-statement

MDXer · Apr 27, 2005

From what I am seeing you are trying to accomplish things your cube is not designed to support. Or you trying to get the information in a way that the cube can't deliver.

Lets take an easy one. Total population size. This should be a measure computed based upon the Selections of your dimensional elements.

Sample size for region you can carry as a dimension member attribute. Given the assumption that Region A is always going to have a sample size X regardless of the dimensional selections.

Sample Fraction would then be

Region.CurrentMember.Property("Size")/Population Size

To do the Sub group calculations you may want to examine adding or modifying dimensions that allow subgroub s to be used outside of the logic. If not you'll Have a ton of IIF() logic to get your sub group values regardless of your dimensional elements.

A second alternative may be to create a seperate cube which has your subgroup diemnsion along with a number of other items (I don't know which as this is simply a brainstorm) you could the bring both of these cubes into a virtual cube.

probably not what you wanted to hear

Shoot Me! Shoot Me NOW!!!
- Daffy Duck

drunkenwallaby · Apr 27, 2005

Hi MDXer! First of, thanks a lot for your detailed and (in a way

) helpful answer.

It just came in mind, that I could do some dirty hardcoding which similifies the variance formula by much. it would be (considering just 2 regions)

Var(\hat N_d) = constant_1 * (pd_1 * (1 - pd_1)) + constant_2 * (pd_2 * (1 - pd_2))

since I can calculate the values N_r, n_r, f_r before working with the cube. The only values not known are the nsd_r, that I need to calculate pd_r.

I already thought of adding a dimension as you wrote, but since I already managed to calculate a measure, that returns the nsd_r, I thought I could somehow grab this value with the function .currentMember. But I failed miserably.

Since the ultimate goal of my work is (when the cube is ready) to add some server-side conditional format that - when the data cube is used as data-source within ms excel - the cells are formated diffently conditional on the value of the computed variance.

Do you have a hint for me, how I should try to archieve this goal. Is it better to go for one single cube or 2 cubes that are somehow joined then?

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

A troublesome MDX-Statement 1

drunkenwallaby

Technical User

MDXer

Technical User

drunkenwallaby

Technical User

Similar threads

Part and Inventory Search

Sponsor