Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

One or more fact tables?

Status
Not open for further replies.

iliask

Programmer
Feb 19, 2009
9
0
0
GR
My question is about the fact table. Can I go with only one fact table or do I need a second one?

I’ll try to explain it with an example.

I’m in a hospital and I need to have the following data per clinic: (all the data are in day basis) (A hospital has many clinics).

1. Number of incoming patients per gender, age group, insurance type etc.
2. Number of outgoing patients per gender, age group, insurance type etc.
3. Number of transfers to another hospital again per gender, age group, insurance type but also by transfer reason.

Can number 3 go into the same fact table as 1 and 2? It has one more dimension, that is the transfer reason.

If I make a fact table like this:
1. Date
2. ClinicId
3. GenderId
4. AgeGroupId
5. InsuranceTypeId
6. TransferOutReasonId
|
7. NumberOfIncomingPatients
8. NumberOfOutgoingPatients
9. NumberOfTransfersOut

Can this work? On the same day I have
1) incoming patients with no TransferOutReasonId
2) and transfer out patients with different transfer reasons.

So I could have one record per day and dimension combination for incoming and outgoing and put 0 or null on the NumberOfTransfersOut and on TransferOutReasonId fields and then so many records as many transfer out reasons I have on that day and put 0 on the incoming and outgoing fields.

Is this good practice or not? Do I need so many fact tables depending on the combination of dimensions I have? I mean what if I have another measure that has Gender and Age as dimensions but not Insurance and perhaps it even has a new dimension let’s say disease type. Does this go in the same fact table or not? If not I have to make three fact tables to get statistical data for the same source "Clinic".

Thanks for any help
 
Firstly, I would try to make the fact table(s) more low level if you can. One record for each "event." A patient checked in -- record the event. A patient checked out -- record the event. A patient was transferred -- record the event. With this approach, you would also need to add a foreign key for the patient dimension. In fact, you would have to bring this down to the patient level, unless all transfers for one hospital on the same day always have the same "transfer reason."

I'm not sure how your organization is set up, and I know nothing about the health care industry, but assuming the logic you have multiple hospitals within the same organization, I would probably design the fact tables as follows:

Code:
PatientTransfers
----------
PatientKey,
DateKey,
OriginHospitalKey,
ArrivalHostpitalKey,
TransferReasonKey,
--your other Foreign Keys,
PatientCount -- Optional - Either always a "1" to summarize, or just using COUNT(*) logic

PatientArrivals
----------
PatientKey,
DateKey,
HospitalKey,
ArrivalTypeKey (either "walk-in", "transfer", etc)
--your other Foreign Keys,

PatientDepartures
----------
PatientKey,
DateKey,
HospitalKey,
DepartureTypeKey (either "released", "transfer", etc)
--your other Foreign Keys,

You could combine Arrivals and Departures into a single table, adding a dimension to specify Inbound/Outbound.

But what this gives you is the ability to track the total number of transfers for the organization without double dipping. It also gives you the ability to track the net movement of patients at a single hospital location.


However back to your original question, as a general rule, I try not to combine semi-related facts into a single fact table. If some dimensions simply do not apply to some of your facts, I don't like to code those records to an "N/A" or "Does Not Apply" value. Your end user/reporting tool can probably deal with combining measures from different fact tables for analysis. If you're using something like SSAS 2005/2008, it does a great job of allowing you to bring in data from multiple fact tables where the report's attributes are applicable.

And yes, this does duplicate data in a way. It should not be a problem if your ETL keeps everything in sync.

I'd be interested to see how others would model this.
 
I would not discard the idea of putting the facts together in 1 fact table. As they represent movements of patienst relative to the hospital. In that sense arrivals, departures and transfers ARE the same and should be collected in a single fact table. And yes, that means that the non-transfers have need for a N/A key in the transferoutreason.

As in many cases, it is the business requirements that should make you decide. If the business requires the aggregated data, and you do not anticipate this to change in the near future (within say 2 years), the design from Iliask is perfectly valid.
It is the difference between periodic snapshot facttables and transaction oriented fact tables. Each has its advantages and drawbacks for specific situations.

The advantage of Riverguy's approach is the ease with which you can handle late arriving facts. In the approach by Iliask, you have to provide in the ETL for this.
 
RiverGuy has a good idea regarding granularity at the event level. However, you may need a daily summary table (aggregation table) for each clinic as well. And depending on the objectives, the granular event table may not be needed. However, my experience has shown me that it is better to be flexible in design as the users will change direction or want to dig deeper after they complete their "analysis of the month".

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
Thanks for your answers. I did go with the one table approach for the time being. As I'm new to SSAS I will try to balance ease of use and end user ease of use.

I'd like to have all data in one table as long it does not give problems to end users when trying to do their own reports.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top