Effective programming

phadobas · Jan 16, 2013

I'm writing a program that stores about 30,000 new records in a mysql database every day. These are call records from a phone system. I'm writing it in python, but I imagine the programming language is not relevant for my question (correct me if I'm wrong).
While the phone system reports certian data (extension that made the call, date and time of the call, etc), some other information needs to be calculated (the price of the call), or determinine the destination of the call (city, state or country), before it can be stored away. To determine the country or domestic destination of a call, the program needs to look up the destination from the mysql database, based on the number dialed. This destination table has some 130,000 entries all by itself. In other words, about 30,000 times a day, I need to do a search in the destination table that has 130,000 entries, and then store a new record in another table in the database.
What's the best way of doing this, if I want to avoid data corruption in case the computer were to crash, but want to keep it effective?
Should I just read the destination table into memory when the program starts up, and do the search from there 30,000 times a day? Or will that be too much even for a server-grade computer, and should I just look the data up from the mysql database 30,000 times a day?
And about storing 30,000 record per day... Should I just open a mysql connection at the startup of the program and almost never close it (since the program runs 24/7), or should I open and close the mysql connection 30,000 times a day? If I leave the connection open all the time, will I lose data in case the computer crashes?

DonQuichote · Jan 18, 2013

Opening and closing a MySQL connection is pretty fast, and a connection also acts like a session (for example, SQL variables live as long as the connection lasts). So unless opening and closing a connection really gets in the way, be nice and use a connection per request or session. Using different connections is also highly useful for multi-user stuff. In fact, I always use a separate connection for error logging as well, just to prevent a ROLLBACK clearing a log (this connection is only opened on demand).

Loading the entire table into memory may not be useful, but there are other forms of caching. You could load the top 10% into memory, for example, or keep the last needed 100 records into memory. This, off course, is highly situation-specific, so you are better suited to judge on that.

Off course, caching is best done on data that does not change. On such data, you may find the HANDLER statement useful for its speed.

Leaving a connection open will not lose you data unless you keep a transaction open.

Good luck!

+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)

phadobas · Jan 18, 2013

Thank you. Very useful information.

KarveR · Feb 1, 2013

Would it be more prudent to use a stored procedure to do an "after insert" update for the location of the call?
This will save you doing an additional lookup back to your app, and just leaves the database to do its thing on simple inserts.

______________________________________________________________________
There's no present like the time, they say. - Henry's Cat.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Effective programming

phadobas

Technical User

DonQuichote

Programmer

phadobas

Technical User

KarveR

MIS

Similar threads

Part and Inventory Search

Sponsor