Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

dimensional model for a factless fact

Status
Not open for further replies.

gary8877

IS-IT--Management
Aug 15, 2006
7
CA
I am exploring the design options for the dimensional model of Health Care System. In this design we don't have any financial data or any sort of concrete measurements. The business process works as follows:

The patient enters into the Hospital or Health Services Center and answers some questions on an assessment form. And by the answered questions, it is determined what kind of services are needed by the patient. This entire process is termed as "in-take assessment" and focuses solely on assessment.

Mostly the reports will be measuring the count of assessments based on various dimensions. In this model we have Patient, Assessing Organization, Diagnosis (multivalued), Time, User (one who assesses), Assessment Type, Care Product, etc. as Dimensions and the Actual Assessments as the Factless Fact. The assessment process form consists of objective questions(defined values/radio button choices - mostly int or string) and open-ended questions (free text). There are approximately 100 questions in the form. The reporting criteria(where clause) can be one of the above mentioned dimensions or may be one of the answers given by the patient in the form. If I create one dimension per Q/A it will be a centipede design with too many dimensions.

Now my question is

*whether the assessment Fact should contain all answers in one row in different question columns (in which case, it will be a very big table - approximately 120 columns in one row with 30,000 Assessments each month) or

*should the questions be in a dimension, in which case the dimension will be as large as the fact, in terms of rows. It will act more like a fact table. It is like having a Primary Key in the Fact table Assessment and Foreign Key in the Assessment Q/A table or

*should I try to come up with various combinations of possible answers for these questions and then put them in a dimension, in which case the dimension again will be very large or

*should I distribute the questions in combinations of 20 each and then apply possible answer combinations in 5 different dimensions or

*should I create 1 dimension per questions/answer (centipede design)?

Any thoughts/suggestions/guidance will be really appreciated.
 
I would start off with something like this initially:

Assessment dimension: Assessment ID, Time, User, etc.

Question dimension: Question ID

Diagnosis dimension: Diagnosis ID

Answer dimension: Answer ID

Question fact: Assessment ID, Question ID, Answer ID, Answer Text (using a 0 Answer ID for free text)

Diagnosis fact: Assessment ID, Diagnosis ID
 
The assessment looks like a survey. Typical survey type data structures should suffice. Examples of tables in survey data structures.

Survey
SurveyResponder
SurveyQuestion
SurveyResponderAnswer
Question
QuestionValidAnswer
ResponderInfo

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
Thanks for answering my question. So are u both suggesting to go with 1 row per assement question. In this case the fact will end up with 100 rows per assessment. And each patient is assessed multiple times, lets assume avg 5. So I will end up with 500 rows per patient. And they normally assess 30 thousand people per month. So this means 500*30,000 = 15000,000 rows per month and 15000,000 * 12 = 180000000 per year. This design is good in terms of only two joins. But in terms of no. of rows, performance might still be a problem. Any thoughts?
 
You could consider keeping the most recent assessment(s) in one table, and the earlier ones in another; available "joined" via a UNION view. Alot depends on what kinds of queries you are expecting.

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
If you're worried about the performance, a fact table is a good candidate to randomly populate 180,000,000 rows - since you're mostly concerned about a few numerical values as opposed to string data. Populate a test table, and try out the performance. Depending on your DBMS, you may also have other alternatives such as partioned tables.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top