drunkenwallaby
Technical User
Hi everyone.
I'm having trouble to get a MDX-statement to work and I hope somebody around here may point me to some helpful tutorials or similar stuff.
To start with, I'm a statistican who is quite familar with SQL queries but I've not managed yet to create a MDX-statement to calculate a specific computed measure. Let me explain shortly the setup of my (very simple) data-cube. It consists of only 6 dimensions. Those are:
- region (9 parameter values, star-scheme)
- age (10 parameter values, star-scheme)
- education (4 parameter values, star-scheme)
- nationality (4 parameter values, star-scheme)
- labour-force status (4 parameter values, star-scheme)
- sex (2 parameter values, star-scheme)
The population is divided in a lot of subgroups, which means every possible combination of parameter values of the dimensions may be considered as such a subgroup. Example: Subgroup "d" is defined as all objects, that live in region '1', their labour force status is '2' and their sex is '1'.
With data coming from a sample survey, the total domain size N_d in any of the possible subgroups G_d is estimated with the help of some (known) weights w. To make a long story short, the values in the cells of the data-cube are point estimates \hat N_d for N_d, the total domain size of group "d" in the population.
My task is now, to compute a variance estimator with the following formuar (1) by using a MDX-Statement. The variance for a small domain "d" within a specific region "r" is given by:
Var(\hat N_d) = \sum_{r=1}^{9} (N_r)^2 * (1-f_r) * (pd (1-pd))/(n_r-1) (1)
where
- N_r ... total population size (estimated domain size for region r)
- n_r ... sample size for region r
- f_r ... "sample fraction", which can be calculated as: n_r / N_r
- nsd_r ... number of sample objects in the selected subgroup "d" in region r
- pd ... nsd_r / n_r
My problem is that I have no clue how to create an MDX-dtatement that calculates those values simultanely for all the cells that are returned as a result set from the data-cube when it is processed. The biggest problem that I have is, that I don't know if I can use information stored in a table that is neither a dimension, nor a measure to compute calculated measures. If this was possible, I think implementing formula (2) isn't that easy. I could store all the constants in a separte table and most of the work would be to get the number nsd_r out of the cube...
I'd be very glad if someone has some hints for me. Really, and I'd happily pay lots of beer of stuff for the one who guides me to a successful MDX-statement
I'm having trouble to get a MDX-statement to work and I hope somebody around here may point me to some helpful tutorials or similar stuff.
To start with, I'm a statistican who is quite familar with SQL queries but I've not managed yet to create a MDX-statement to calculate a specific computed measure. Let me explain shortly the setup of my (very simple) data-cube. It consists of only 6 dimensions. Those are:
- region (9 parameter values, star-scheme)
- age (10 parameter values, star-scheme)
- education (4 parameter values, star-scheme)
- nationality (4 parameter values, star-scheme)
- labour-force status (4 parameter values, star-scheme)
- sex (2 parameter values, star-scheme)
The population is divided in a lot of subgroups, which means every possible combination of parameter values of the dimensions may be considered as such a subgroup. Example: Subgroup "d" is defined as all objects, that live in region '1', their labour force status is '2' and their sex is '1'.
With data coming from a sample survey, the total domain size N_d in any of the possible subgroups G_d is estimated with the help of some (known) weights w. To make a long story short, the values in the cells of the data-cube are point estimates \hat N_d for N_d, the total domain size of group "d" in the population.
My task is now, to compute a variance estimator with the following formuar (1) by using a MDX-Statement. The variance for a small domain "d" within a specific region "r" is given by:
Var(\hat N_d) = \sum_{r=1}^{9} (N_r)^2 * (1-f_r) * (pd (1-pd))/(n_r-1) (1)
where
- N_r ... total population size (estimated domain size for region r)
- n_r ... sample size for region r
- f_r ... "sample fraction", which can be calculated as: n_r / N_r
- nsd_r ... number of sample objects in the selected subgroup "d" in region r
- pd ... nsd_r / n_r
My problem is that I have no clue how to create an MDX-dtatement that calculates those values simultanely for all the cells that are returned as a result set from the data-cube when it is processed. The biggest problem that I have is, that I don't know if I can use information stored in a table that is neither a dimension, nor a measure to compute calculated measures. If this was possible, I think implementing formula (2) isn't that easy. I could store all the constants in a separte table and most of the work would be to get the number nsd_r out of the cube...
I'd be very glad if someone has some hints for me. Really, and I'd happily pay lots of beer of stuff for the one who guides me to a successful MDX-statement