Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Inserting records in bulk, but also checking for duplicates

Status
Not open for further replies.

LucL

Programmer
Jan 23, 2006
117
US
Hi Guys,

I've got a database that contains about 2 million records.

I also have a form that allows the user to insert 500 new records at a time.

I'm trying to figure out the best way to add the new records in bulk but also check to make sure they are not already in the database (only one column matters).

my insert command looks something like.

"INSERT INTO db1 (col1,col2,col3) VALUES (v1,v2,v3),(v4,v5,v6),etc"

Right now I'm dumping all of the col1 (the only column that I want compared) into an array in perl then scanning that array for duplicates and insert the data which is not a duplicate but this method if a bit flawed sine I have to dump all the COL1 records into an array and these can take up quite a bit of RAM.

Would using a "SELECT" statement with the 500 new records (ex: (SELECT col1 FROM db1 WHERE col1=v1) OR (SELECT col1 FROM db1 WHERE col1=v2) OR etc...") to check for duplicates be better?

Thanks!
Luc L.

 
Check out the DISTINCT keyword for your SELECT statement

________________________________________________________________
If you want to get the best response to a question, please check out FAQ222-2244 first.
'If we're supposed to work in Hex, why have we only got A fingers?'
Drive a Steam Roller
 
have you declared a unique constraint on the target column?

this will allow you to avoid having to check for duplicates altogether (since the database engine will do it for you)

r937.com | rudy.ca
 
Also be sure to leave off the "or die;" section in your script's insert sql. When it encounters a dup, it will simply write to sdtout (if in text mode) and skip. If you're running a web app and also using sendmail, keep an eye on the mailboxes. If updating within a script, the dups may show as errors in the mailbox of the user running the script. The messages could queue up and eventually fill your disk.

Later,
Mark
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top