Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

General DB question - looking for best option

Status
Not open for further replies.

WorkingMom3

Programmer
Jun 20, 2008
15
US
I need some suggestions/advice.
I've been asked to write an application for a new venture on which a friend is about to embark.

We're looking at no more than a 1 year turn around.

In the Microsoft arena, I've done C++ (6.0 and some .net) for the last 8 years, VB and VBA programming...off and on...for about 7 years, as well as experience with many other languages.

This application would need to I/F with a database of hundreds of thousands of clients. We'd need the ability to archive client data when expired (without removing it) as well as generate reports & statistical charts. I've never written an application like this before (using a databse) and trying to determine the best language and database to use...one that places the least amount of restrictions on the customers.

So I guess the question is...what are the options...what could provide the easiest turn around...work best with C++ or VB...how can I find out my options?? I'm a little overwhelmed not understanding my choices.

Thanks in advance.
 
This is very true, TPS is basically worthless when dealing with OLAP.

Can the op tell us if it's OLAP or OLTP?

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
"I don't know the size but we have tables that are storing case information going back 25 years with tables that contain millions of records...defendants, charges, hearings, sentences."

What kind of database are you using?

It's an IBM iSeries (previously the AS400) that uses a DB2 database.

Leslie

Have you met Hardy Heron?
 
You guys are great.

The data is what I would call critical data, however, for this initial release, I'm finding the data is rather simple, if that makes sense. I'm finding that most of my data fits this pattern:

tblClientTable
clientId
OtherClientFields

tblClientReportLink
clientId
reportId

tblClientReports
reportId
otherReportData
Other Fields

with clientId being the "centric" ID, if you will. (Yeah...no...I'm not yet there on DB lingo. I have to look up practically everything you guys say...but I'll get there)

Pulled the following off a google search on OLTP vs OLAP. My “answers” are in bold and annotated with ***

Purpose
OLTP: To support business processes.
OLAP: To assist with making better business decisions.
** both intentionally selected

Where the data comes from
OLTP: Online entry, point of sale devices, data loggers etc.
OLAP: OLTP databases, flat files.

Criticality
OLTP: Generally considered missioncritical. Regularly back up to protect from data loss.
OLAP: Important to decision support; however; organizations can continue to operate without it. Back up less often than OLTP databases.

Database Size
This was the original question. Unsure of how to answer this as of yet
OLTP: Relatively small database size if the history information is regularly archived. Can quickly bloat if history is not removed.
OLAP: OLAP database can be very large. Contains historical information generated by various OLTP databases within an organization.

Concurrent Users
OLTP: Potentially very high numbers of users. Some OLTP databases support thousands of concurrent users.
OLAP: OLAP databases support a much lower number of users than their OLTP counterparts. ** expect anywhere from 10 to 50 concurrent users

Response Time
OLTP: All application queries should complete in less than a second.
OLAP: Queries commonly take seconds, minutes, and hours.

Data Changes
OLTP: High number of insert, update, and delete operations are generate by user transactions. ** no deletes – inserts and updates only – corrections handled by annotated update
OLAP: Very low number of data changes generated by users. Data is updated by bulk import batches.

Ad hoc querying
Unsure of how to answer this. Queries will be “predefined” but tools will be created to allow creation of limited “user defined” queries.
OLTP: Poor ad hoc query performance. Indexes are only created to support defined application queries.
OLAP: Very good ad hoc query performance. Indexes are created to support queries possible queries.

Querying complexity
OLTP: Data is highly normalized requiring many joins to retrieve information.
OLAP: Data is denormalized requiring few joins to retrieve information.

If I'm correct in my understanding of normalized vs denormalized data, the tables are completely normalized (which I do understand could possibly have issues but none have arisen yet)

It would appear that OLTP would be the answer to the question.

I guess I need a better understanding of TPS.

 
Actually, you need both OLTP and OLAP. If your application supports business processes, this suggests that transactions are involved (orders, workflows, whatever). Therefore, the business process components should be supported by an OLTP database.

The reporting component should be underpinned by an OLAP database. This database will of course be fed by the OLTP.

One thing to remember when you design a system that supports transactional business processes: NEVER run reports directly against the transactional database. Have a process in place that extracts the OLTP data, transforms it to the proper cubed format, and loads it into your OLAP reporting database. This ensures that transactional performance will remain stable and that you have the right schema in place for a flexible reporting solution without rejiggering the OLTP schema.

My US$.02

Phil H.
Some Bank
-----------
Time's fun when you're having flies.
 
Phil, thanks for your 2 cents. Any chance you could expand just a little. My impression of OLTP vs OLAP was that they are database design paradigms. Maybe you could recommend a good book.

And, back somewhat to the original question...does it seem like I can use Access?
 
I am an Access fan, unlike many of the people above :), yet I would have to say that I do not think that this project is suitable for Access unless the database will only be used by one or two people for statistics, reporting and related items.
 
Based on the information provided, I'm thinking that Access won't be the correct solution for you. I'd probably recommend SQL Server Express Edition for this project.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
The original posts leads me to believe that the transactional systems already exist so a OLTP system probably isn't needed. As for OLAP versus OLTP they are design methodologies for specific types of systems.

The briefest comparisons are:
OLTP: Designed to quickly insert data.
OLAP: Designed to quickly retrieve data.

Granted that is like the most basic explination but the fact that the 2 are support the opposite of each other, which means that most everything in the design will be different.

I believe there are better things to worry about at this point than the platform. Some points to consider first are.

Primary purpose? To me it sounds like it is data archival and reporting.

User Scalability? How many users will you need to support for the reporting piece. This can have an impact as in a 3NF system the overhead of mutiple users can seriously impat performance. Where a Star Schema has much less overhead.

Data Scalability: How much data will you begin with, how much data do anticipate in 6 months? 1 year what is the anticipated growth in volume and how many years of history will you retain and load initially.

User Ability: how tech savy are yuor users, do they require simple reports that are delivered at the click of a button, or do they require parameterized reports or maybe ad-hoc capabilities. This is important because this is where the users will spend their time and why reinvent the wheel when there are dozens of applications already built to do this, but some applications target specific backends.

Good luck!

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
I'm trying but it really seems as if all this is going around in circles.

Would anyone be able to give me example answers to Paul's questions? Why are there better things to worry about than the platform? Isn't that the reason for asking the questions he's asking...to determine the platform?

Primary Purpose: His statement is correct.
User Scalability: Between 20 and 40 users at one time
Data Scalability: While still trying to grasp that myself because the application is new to me, my strong feeling based on what I've completed so far is that data will not grow at a huge rate. That is, input during a single day, month and year is somewhat constrained by the nature of the business. After a certain period of time, some customer accounts will be archived (just put "in a box on a shelf" just in case with no real active intent of accessing them).
User Ability No, I do not expect users to be tech savy. I will need to automate report generation for them. In addition to the push-a-button-reports, I will need to create a mechanism for them to "create" their own reports (that is, modify a push-a-button-report based on dialog box selections)

And though I said it before...thank you so much for all your time!





 
My reasoning for saying other things are more important than platform is that the design to some degree should be platform independent. I don't design a DW system for SQL Server I design the DW system according to the business needs. If it is to be a fairly small system with minimal users then a smaller DB engine could probably handle it. IF it is a large system with many users then other requirements come into play. 30-40 concurrent users on even a medium size database doing statistical/analytic reporting will have an impact on what technologies I use. The technical abilities of my audience will also impact my decisions. Now wouldn't you look good if you went to them and said we can do this in MySQL to only later realize that intial scope was really lacking? Then you realize that users need response times measures in 0-3 seconds? that dashboarding tools need to be implemented? Tools in your chosen technologies don't easily lend themselves to report creation by the non technical savy?

Archiving and Reporting is what a Data Warehouse and Data Marts are designed for.

Reporting well you have a ton of tools designed for presentation. Cognos, ProClarity, Sql Server Reporting Services, Performance Point, AlphaBlox, Business Objects and Micro Strategies are just some of the bigger names. Each with its pros and cons and some specific to a db platform.

Scalability becomes an issue more on response times to user requests and the amount of data. Most users want a reporting system to spit the reports out with little to no lag which becomes an even greater challenge when you increase concurrent user count. A change in data volume may increase your ETL times if your data increases 10% should it take 10% more time to load and can you afford that time. Again these answers may impact your platform decision.



Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top