MySQL suitability for large data processing

Status
Not open for further replies.

scriggs (IS-IT--Management)
Jun 1, 2004
I am setting up a PHP driven reporting tool and am trying to select my backend.

I will be receiving data from multiple sources, probably about a million transactions a day to start, maybe increasing in the future. The data will be quite minimal, maybe 20 fields of data - the style will be something like web log records.

I plan to add these to a single table, so by the end of year 1 I expect 365 million records, maybe nearer 600 million. This will grow exponentially as the years go on.

Can MySQL handle this?
How fast will it process a query on that data?
Should I look to split the data somehow (I don't know how)?
 
MySQL can certainly handle that volume of traffic and data. If the tables are well indexed, queries can be very fast; MySQL claims to be one of the fastest databases around. Regarding table size: if you are expecting, say, 100 bytes per record and 500 million records per year, that works out at 250GB after 5 years. If your filesystem and hardware can easily handle a file of that size, then you don't need to split it. It might make more sense, though, to have a table for each year (or even each month), as older data would presumably be less relevant than recent data, and more reasonably-sized files would ease the pressure on your system. That would complicate queries, but such compromises are inevitable with a database this big.
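The storage arithmetic above is easy to check. This little sketch just uses the round figures from the thread (100 bytes per record, 500 million records per year); your real numbers will depend on your actual schema:

```python
# Back-of-envelope storage estimate using the figures from this thread.
# These are assumptions for illustration, not measured values.
BYTES_PER_RECORD = 100
RECORDS_PER_YEAR = 500_000_000

def storage_gb(years, bytes_per_record=BYTES_PER_RECORD,
               records_per_year=RECORDS_PER_YEAR):
    """Total raw table size in decimal gigabytes after `years` years."""
    return years * records_per_year * bytes_per_record / 1_000_000_000

print(storage_gb(1))  # 50.0  GB after year one
print(storage_gb(5))  # 250.0 GB after five years
```

Remember this is the raw data only; indexes come on top of it.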
 
Thanks for the reply, that's ace! A couple of queries you could help me with:

How would I work out the number of bytes per record?
How could I translate that into GB of space?
As far as splitting goes, presumably I could use a UNION query to join multiple months back together? Or is there a better way?

Thanks.
 
Thought of another query: any advice/tutorials on how to index the data? I've never had to deal with so much data or worried about it too much before!
 
Each field in a record uses a certain number of bytes, depending on the data type and size; the MySQL manual gives details of this. You would also need to allow for the size of indexes; each index record would contain the field value and an address.
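To make that concrete, here is a hypothetical web-log-style table (the table and column names are made up for illustration), with the per-column byte costs from the manual's storage-requirements section noted alongside:

```sql
-- Hypothetical log table; byte costs per the MySQL manual's
-- "Data Type Storage Requirements" section.
CREATE TABLE log_entries (
    id          BIGINT UNSIGNED   NOT NULL AUTO_INCREMENT, -- 8 bytes
    logged_at   DATETIME          NOT NULL,                -- 8 bytes
    client_ip   INT UNSIGNED      NOT NULL,                -- 4 bytes (via INET_ATON)
    status_code SMALLINT UNSIGNED NOT NULL,                -- 2 bytes
    bytes_sent  INT UNSIGNED      NOT NULL,                -- 4 bytes
    url         VARCHAR(255)      NOT NULL,                -- string length + 1 byte
    PRIMARY KEY (id)                                       -- index: key value + address
);
```

Sum the per-column costs for a rough bytes-per-record figure, then multiply by your expected record count and divide by 1,000,000,000 to get decimal GB.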

Indexes are used to improve the speed of queries (as well as ensure non-duplication of values, in the case of a unique index), but they have the drawback that they reduce the speed of inserts and updates. So you would only index a particular field if it is likely to produce a noticeable improvement overall. Specifying indexes is very easy - you can do it when creating the table, and/or add them at a later stage; it's all explained in the manual.
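Both ways of specifying an index look like this (again using the made-up log table and columns from above as an example):

```sql
-- At creation time:
CREATE TABLE log_entries (
    logged_at DATETIME NOT NULL,
    url       VARCHAR(255) NOT NULL,
    INDEX idx_logged_at (logged_at)   -- index only what your WHERE clauses filter on
);

-- Or added later:
ALTER TABLE log_entries ADD INDEX idx_url (url);
```

A reasonable rule of thumb: index the columns that appear in your WHERE, JOIN and ORDER BY clauses, and nothing else, since every extra index slows down those million inserts a day.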

Another aspect you might need to look into is tuning the server. With such a high volume of traffic and data, things like caching become significant.
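The relevant knobs live in my.cnf. The variable names below are real MySQL server options, but the values are placeholders to illustrate the idea; tune them against your own workload rather than copying them:

```ini
# Illustrative my.cnf fragment - values are examples, not recommendations.
[mysqld]
key_buffer_size  = 256M   # cache for MyISAM index blocks
query_cache_size = 64M    # cache results of repeated identical SELECTs
table_cache      = 512    # number of open table handles kept around
```

With a large, mostly-append table, the index cache in particular is worth watching, since your queries will live or die by whether the index fits in memory.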

UNION is certainly a good way to deal with multiple identically-structured tables.
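For example, with one table per month (table names here are hypothetical), a cross-month report might look like this:

```sql
-- UNION ALL keeps duplicate rows (what you want for log data)
-- and is faster than plain UNION, which removes duplicates.
SELECT logged_at, url, status_code FROM log_2004_05
UNION ALL
SELECT logged_at, url, status_code FROM log_2004_06
ORDER BY logged_at;
```

Note the ORDER BY applies to the combined result, and all the SELECTs must return the same number and types of columns.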
 