Lots have been talked about the surrogate keys improving the join performance. Has anybody really tested it? Can someone point me to some benchmarking test results? Any help in this direction is much appreciated.
From my experience an integer surrogate key speeds up join performance since it usually replaces a longer or character-based natural key, thus reducing disk I/O. It also may speed up indexing on that column for the same reason.
Thanks for the posting. I did come across the first two links with some googling, before posting the message here.
As you may have already noted in Kimball's article he says that by using the surrogate keys that the join performance is *likely* to improve. He also underlines the fact that it has not been proven. I am wondering if anyone experienced any such performance boost.
I am meeting with our DBA this week for whom the datawarehousing principles are a total anathema. If I can see some real benchmark results of surrogate key vs natural key, that would be a big selling point for me.
When you ask about advantages of staging area and ODS, do you mean compared to a data warehouse? I'll assume such for my response, but correct me if I'm wrong.
An ODS is typically set up for a smaller data set than a data warehouse, and expected to answer queries with fast response times. It's usually set up and optimized for just one operational system, whereas a data warehouse typically covers multiple business areas. An ODS can usually only answer a limited number of questions but does it rapidly.
A staging area is basically where source data for a data warehouse is loaded prior to populating the actual data warehouse tables. An ODS might not use a staging area, but most DW implementations do. Some are persistent staging areas which remain and are archived in that state, and some are deleted after each load.
This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
By continuing to use this site, you are consenting to our use of cookies.