I am designing a datamart with 2 facts table. The dimension of one of these tables is very big, and therefore cannot be moved from the ODS to the datamart. How can I have a fact table in a datamart and one of its dimension in the data store ?
From a business delivery point of view - why not? It's our job to deliver what the end users need. They should have no care about where the data is coming from as long as their requirements are met (access time, ad-hoc queries, etc).
From an implementation point of view, there are probably quite a few obstacles to overcome. For instance, the ODS and datamart are probably on different databases, servers. The naming standards may be different. Data elements may have different formats (char vs varchar, number vs integer0 and different sizes. The update time and frequency between the ODS and datamart may be hours or days apart, resulting in possible orphan or mismatch records. Making the cross database/platform join may not be possible with existing query tools. I think you get the idea.
In summary, remember it is our mission to meet the user community's requirements. They should have no care how we do that. In addition, remember that all the DW standards and architecture guidelines are based on solutions already implemented. You may need a new breed of solution.
-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
I agree with you on the obstacles, although, I am not too worried about it as the store is well designed and sources are consistent.
In term of "how", I am thinking dblink, but a bit worried about the performances.
Any suggestions ?
What are the differences for the ODS and Datamart?
Database and version? Server and ServerOS? Update frequency and latency?
Another idea I just thought of.....might have benefits...
Have part of your dimension in the datamart (the most popular values of the domain) and the rest in the ODS overlaid by a View which is a UNION query of the two tables. Might have benefits if your data is predictable as far as user access "popularity".
-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
my question is what you define as very big? If the number of records are the issue then depending on what platform your on then there are bound to be methods or tools to address this.
linking the servers will defintely come with a performance hit not only to the server resources but also to the network resources for the part of the network the servers exist on. If the number of records again is the issue then it seems like over the long term you will waste more time and resources feeding data over a link if users are requesting large numbers of dimensional records on a regular basis.
a few more details on the reasoning and size of the data might propmt more creative solutions.
This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
By continuing to use this site, you are consenting to our use of cookies.