Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Large amount of database synchronize Large amount of database synchron 2

Status
Not open for further replies.

zyzjacky

Programmer
Jan 19, 2006
76
MT
If I have two tables that both contain same 1million rows of data, and if data from one has been changed (a little changes, some deletes, updates, and inserts). How would I do the synchronize between these two tables (using PHP)? any suggest algorithm?
 
Hmmm...Can the replication be done between different databases? e.g. MySQL and MSSQL? and the master database which I cannot control. For example, I am using MySQL as my backend, and I am importing data from more than one tables of another database (MSSQL), and merge everything into one table in my MySQL. (use select...join to get all data i need, and insert into one table in mysql). Now, the original data has been changed, and what is the best way to synchronize changes to mysql?
 
is there a timestamp column or dateupdated column inthe SQL Server db? Those could be used to determine what records were changed.

or you could creatively use checksum and create a column storing that inthe mysql database (since I don't know if mysql has this or if it would compute the same value that SQLServer does) and then compare that value to the checksum of the orginal. Might get a bit complicated with denormalizing the data but it should be possible.

"NOTHING is more important in a database than integrity." ESquared
 
Hi Jacky,

okay, if you don't have control of the master database other than read access it's hard to keep in sync without a full table scan for changes.

From MSSQL you can

Code:
Select GUID, Checksum(*) as CheckSum FROM TABLE

Where GUID represents a field which is the primary key of the table. Store the Checksum with the records in your mYSQL data and when you need to update you need to query all checksums again and update records, for which the checksum changed.

This still would be a million rows, just simpler rows with just the primary key field and a checksum field.

You're much better off, if there is some kind of LastUpdate field in the table you could query to be higher than the max value you have.

Bye, Olaf.
 
Thanks to SQLSister and OlafDoschke. I will try to use the checksum.
 
If you'd find a way to compute the checksum on chunks of records you could narrow down tha parts of the table that changed.

A totally different approach would be system data, transaction log and such things, but you'd need administrative rights to get at those things.

Bye, Olaf.
 
I know in SQL Server you can do a checksum on just the fields you are interested in.

"NOTHING is more important in a database than integrity." ESquared
 
Hi SQLSister,

Seems so. It would be nice if you could have the checksum on say groups of 256 records, to narrow down where changes have been made. then break down the chunks into smaller pieces until you're at the row level.

But that would mean processing the table even more often on the server side, it would just reduce the traffic.

Bye, Olaf.
 
One more though, Jacky,

if you have read access to the MSSQL Server database, I suppose you either have a hoster that provides this server or a contact with the webmaster of that database.

If you pull out all checksums just to find out the lower than 1% changes in the table you're interested in, you're still causing more traffic on the server than if the DBA would simply replicate data of the table you're interested in into your database as the slave. It simply is by far the best method both in performance and costs to sync tables both for you and the database owner.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top