I have clickstream data in Google Analytics: dimensions such as referring URL, top landing pages, and top exit pages, and metrics such as pageviews, number of visits, and bounces. There is no database yet where this information is stored. I am required to build a data warehouse from scratch (which I believe is known as a webhouse) from this data, so I need to extract data from Google Analytics and load it into a warehouse on a daily, automated basis. My questions are:
1) Is it possible? The data grows every day (partly in metrics or measures such as visits, partly in new values such as new referring sites), so how would the process of loading the warehouse work?
2) What ETL tool would help me achieve this? I believe Pentaho has a way to pull data out of Google Analytics. Has anyone used it, and how does that process work?
3) How does Google Analytics interface with Pentaho, and in what ways can you use Analytics features right inside Pentaho?
Besides answers, any references or links would be appreciated.