Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations biv343 on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to speed up Updates

Status
Not open for further replies.

azzazzello

Technical User
Jan 25, 2005
297
US
Hi guys,

I have a set of 16 tables TABLE_0 through TABLE_F, where a primary key for each entry is a hexadecimal number 32 digits long. Each table contains entries starting with a corresponding HEX digit (so TABLE_0 contains numbers starting with 0, TABLE_D - with D's, etc). There is approximately the same number of records in each table, and that number is currently 1,800,700. Which makes about 30,000,000 records in the set of 16. Every day about 3,000,000 records are processed such that basically if an entry exists, it is updated with new information - if it does not, an insert is made. They also get equally spread up between the 16 tables. 95% of these daily 3M records are UPDATES - only about 150,000 are new inserts. I am doing an multiple INSERT ... ON DUPLICATE KEY UPDATE, a 1000 records at a time per table. There are 3 indexes on each table, one on ID and 2 on DATE fields, one of which only gets set on new INSERTS and never updated after that.
Now my problem. It is dreadfully slow :). 100,000 records are processed in about 10-12 minutes. As I said, I run multiple insert/updates with 1000 a pop, and I optimize the tables every day before the processing. All the tables are MyISAM, MySQL version is 4.1.15, the server is a dual 3.6Ghz Xeon with 4G of RAM. I have verified that it is the updates that are taking this long, because if I delete the entries needing updates beforehand and then run the script, it will take much, much faster (5 - 6 times faster). I realize that if I have a table for each of the first 2 hexes instead of just one, I will have 16 times less entries to look through, but I would rather not have to manage 256 tables. I am using perl to processes and DBI to interface with MySQL if that matters. Any suggestions as to how I can improve the speed of this system? Perhaps INSERT...ON DUPLICATE KEY UPDATE is not the best thing to do if I know that 95% will be updates? Thank you in advance
 
When you batch the updates 1000 at a time, do you use 1000 individual insert/update statements, or one big statement? The latter would be much faster, as there is only one query-compilation process, and the indexes would only be updated at the end of the batch instead of after every record.

You could also have a look at your cache settings in my.cnf. If the cache is big enough, disc writes will be delayed until the server is idle, thus improving insert/update performance.
 
I use multiple insert syntax, so it's 1 statement commiting 1000 values.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top