Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Dimensions, facts and a classical example

Status
Not open for further replies.

jcoira

Programmer
Sep 26, 2001
51
US
Hi,

Lets take the well known case study of customers, products, etc. Suppose I sell a product to a customer and store the sale in my fact table (which grain is the sale). But after 15 days the customer gives the product back. What would you advise to do in that case to represent that event in my fact table to keep aggregated data accurate ?

Also, suppose that a product belongs to more than one category and of course a category holds more than one product (that is, a many to many relationship). How do I build my product dimension and moreover how do I link a product to my dimension table. Must I include two keys (product and category) in my fact table even when both products and categories are stored in the same dimension? (suppose no surrogate keys are used at this time).

Any help would be really appreciated.

Thanks

// jcoira
 
1. A customer returning a product is a significant event. I would probably include this as a negative sale. The return may be in a different time period, this is entirely the sort of fact you might want to explore, e.g. to explain why one month was better than the next.

2. Look carefully if a product actually is in two categories. Maybe one category is really a subcategory of the other? Normally someone is in charge of the category and hence that product. Product will generally have fields like department, category, subcategory, brand, package size, SKU giving you a hierarchy. Jeremy Nicholson, Director of a UK-based Java and Data Warehousing consultancy
 
1. Another solution could be go to the fact table and remove the entry for the returned product (in case I dont care to loose that historical piece of data). But Im not sure about the performance of such operation in a large fact table.

2- Yes. I have only products and categories in a many to many relationship and I cannot change this structure (as usual). How to deal with this kind of issues ?

Thanks,

// jcoira
 
hi,
How can we remove the entry from a fact table. The fact that there was a sale for that product on that particular date will be lost. Then how can we ensure validity of the fact table. when you aggregate the value might show ok depending on the return procedure when partial is not allowed.
So as jeremy suggested it should be another reverse entry in the fact table.
 
I agree with you guys regarding the first issue, but what about the second issue ? How to deal with a many to many relationship in one of my dimensions ?

Thanks,

//jcoira
 
1. Typically you record a negative sale and has a reason code dimension (straight sale, returns, write-off, shrink etc.). On an aggregate table depends on if you have the reason code dimension. If you don't, they cancel out each other because of the negative sales number.

2. Very simple. You build a Category-product relationship table. If you report on category, you join this table with the sales table. Otherwise you don't. That's it. I am not sure how it's handle in MD cubes. But in ROLAP, this is not a problem at all. You don't really need to worry about dimensions. The relationship is all that matters.


 
1) Record the return. If you are dealing with an ERP system like SAP it has it's own order type... A return is basically a reverse sale. You want to capture it in your fact table because you (should) want to do analysis on WHY customers are returning your product, how often, etc. Is is always returned if Sales person X is selling it...etc. etc.

2) If you have a product that can fall into multiple categories...many to many relationship...how would you EVER know which category to assign to either the fact table or assign from your material dimension table?? You have to base it on a business rule. You SHOULD have the categories under the material dimension table and you SHOULD have a surrogate ID representing one or more of the materials.

Hard to come up with a scenario where the product falls into multiple categories. Might be based on what the product is to be used for which means it's dependent on the customer's intent. Might be based on what department it's sold in...which you may not know on the Return Order. Which means you'd need a GENERIC or unknown category for that instance.



 
Thanks guys.

First issue is clear for me now, but what about the second one?

Lets take another sample to help clarify the many to many relationship I need help with.

I have CONTRACTS to fulfill and TRUCKS carring goods to fulfill the contracts. One contract usually needs more than one trip to be fulfilled (several actually), but at the same time one trip could carry goods assigned to different contracts. That's my many to many relationship.

Also, my CONTRACTS level should be the parent level for the TRUCKS one, so I can drill down to see all TRUCKS belonging to a specific contract.

At the end I want to be able to drill down a hierarchy like:

-Contract1
-- Truck 1
-- Truck 2
-- Truck 3
-Contract2
-- Truck 2 (again)
-- Truck 4
-Contract3
--Truck 4 (again)
--Truck 5

and so on. This is a sort of parent child relationship I need to build on the cube and it has sense from the business logic point of view.

Any help with that ?

// jcoira
 
Not sure why there's confusion here. You can have the following tables:
1. Contract - stores all info about contracts at contract_id level
2. Truck - stores all info about truck at truck id level
3. Contract_truck_relat - table stores contract and truck M:M relations.

That's all. You can join table 3, 2, and 1 to get everything you needed. I don't see any problem.

If you are in a cube, that's still not a problem. You may design a dimension from contract to truck, or truck to contract. Sure you may have some duplication in record, but that's expected. You just need to be careful calculating the totals, because the M:M relationship, you need to take care of the unwanted double-counting. Your cube maybe a little fatter than ideal, but still not so bad.

If you are not in a cube, you are all set to go. MicroStrategy, Brio, even BO can handle this correctly and easily.


 
One more way is to create a factless fact able.
the columns in this table will be

tripid -- surrograte key generated for each trip
or use code from OLTP
tripdate -- date of trip from delivery date in contract
contractno
truckno


this tablw will be joined to your conform dimension truck and contract

 
Hi,

Following Subhashab approach I could have a dimension CONTRCAT_TRUCKS with fields

surrogate_key
ContractNo- Contract number from OLTP
TruckNo - Truck number from OLTP

and therefore with one record for every contract, truck pair received from OLTP, linked to fact table using the surrogate_key.

Does it have sense to you ?

Thanks,

// jcoira
 
Absolutely. You may not need the surrogate_key linked to the factables. Contract_No and Truck_No as compound key should take you there. It depends on the tool.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top