Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

rookie's question..creating a DWH on SQL Server 2005 1

Status
Not open for further replies.

Venky17

IS-IT--Management
Oct 27, 2007
2
US
Dear DWH Gurus

I just entered into DWH field, so please excuse me for a detailed problem definition. My Company has a normalized Seibel CRM database which maintains the relationships of its departments to external Institutions/Organization. This particular Siebel Server Production Database (PROD_DB) sits on SQL Server 2005. In the year 2000, my company didn't realize the importance of Data Warehousing concept and went ahead by creating enormously larger amounts of Views on VIEW_DB(another database) for analysis and reporting purposes. As time progessed, VIEW_DB has become a huge complex database full of tables which infact are views of the tables on PROD_DB because of which the response time has increased and Web-Interface to this VIEW_DB has horribly slowed down. My manager has asked me to understand all the views(tables) on VIEW_DB which are complicated and confusing, to develop a seperate DWH on a different SQL Server 2005 server(no siebel for analysis services). Certain columns in views(tables on VIEW_DB) are straight queries without any source column name and has alias in the result set. So, to develop a de-normalized structure of this VIEW_DB will take weeks of time to lay the foundation for DWH on SQL Server 2005. What I thought is, instead of wasting my time understanding all the views and queries(columns) on VIEW_DB, I can spend my time on tables of PROD_DB where all the tables are in normalized form having defined columns, from which I can list out all the facts,measurable, additive non-key fields of all the tables to come up with a fact table and then develop all the dimension tables which are segregated as per business processes in my company. Please let me know if I am correct or please advise any other suggestion. As I am rookie, I am expecting an indetail reply. I apologize here for being demanding and specific. Its just a request. To be frank, this would save my job.

Thank you
Naren
 
Well Naren,

I think the approach you propose is a good one.
Make a choice if you want your datawarehouse to be entirely dimensional or do you want some form of ODS where you store timestamped variations of your source data to construct history on these source tables.

I would start by identifying one business area and follow the 'datawarehousing cookbook' (e.g. the datawarehouse toolkit by Ralph kimball) for finding the needed dimensions and facts. After that you should design for the ETL proces (and the ETL intermediate schema's).
Get access to some people who know the business definitions tat are contained in the views.

Don't try to get all fact tables and all dimensions ready in one stroke. You will need the experience gained by working on one fact table to refine your ETL architecture.
If you are really pressed for time, take a subset of one fatct-table and later extend it.

Make sure your ETL proces generates run-time metadata (start/end/status of jobs and jobsteps and a complete call-stack whenever an error occurs.)

Do some data profiling to estimate how dirty the source data is (like unexpected null-values, or uniqueness violations) and communicate them to your boss as it takes extra time to clean this data (and the best place to clean it is the source system).

Oh, and make sure your boss understands the complexity of the task you have been given ;-)
If you have no datawarehouse to start with, it may easily take several weeks before you have any tangible results.
 
Hi Hans

Thank you very much. Anyways, as I have consulted with business users and other programmers, there is no need of de-normalizing the entire Siebel CRM database. Only a specific number of tables under each business process have to be de-normalized. I've already started de-normalizing the tables. Any tips and tricks are always welcome. I'll bring this thread back up when I am into development of ETL architecture.

I thank you once again Hans.

Regards
Naren
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top