
Deleting a large number of records in DB2


bunzo99 (Technical User), Jan 16, 2006
Hi all,

I have "some" OJT in DB2 v7.2 and have to support a database that collects about 100 million records a month. We don't have the disk space to keep all of this data, so I have to delete records where Month(datefield) = Month(Current Date) - 2.

I created a very simplistic script that deletes records from my target table one day at a time for the target month, then does a commit, then moves on to the next day, and so on.

The problem with this is that I keep getting an error - the transaction log is full. Up until now, the people helping me with this have told me to increase my log file size. I don't think that's the answer - I now have 50 primary log files with log size = 50000... and I still get the log full message.

Any suggestions?
 
Bunzo99,

Are you committing the changes? This will free up the log files and allow you to start again. There are a couple of threads where this is discussed. Have a look at thread178-1140902 or thread178-641780.

Marc
 
My simplistic script does a commit after deleting each day's worth of records... as stated in my post. The script consists of 31 deletes and 31 commits...

Delete from mytable where month=<targetMonth> and day=1;
commit;
Delete from mytable where month=<targetMonth> and day=2;
commit;
.... etc.

Very simplistic at this time. I need to resolve the log full situation before adding bells and whistles.
 
bunzo99,

Increasing the log size isn't bad advice. However, in your instance, as you're only committing full days, you're still filling up the logs. Can you put an extra predicate on your delete statements to make them more granular? For example, a range key or timestamp?
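
For instance, assuming there is some timestamp column on the table (I'll call it rec_ts purely as an illustration, and the cut-off values below are made up too), each day's delete could be split into smaller units of work:

Delete from mytable where month=<targetMonth> and day=1 and rec_ts < '2005-11-01-12.00.00.000000';
commit;
Delete from mytable where month=<targetMonth> and day=1 and rec_ts >= '2005-11-01-12.00.00.000000';
commit;
.... and so on, in as many slices per day as you need to keep each unit of work within your log space.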

You don't say what your platform is. I've had much faster results on z/OS, unloading, sorting and reloading very large numbers of partitioned rows, rather than running maintenance delete scripts against them.

Cheers
Greg
 
Greg,

Based on what I've read and been told to date, doing a commit should clear the log files, so I'm trying to do a limited number of deletes between commit statements. After increasing my primary log files to 50 x 50000, my latest delete script has been running for over 5 hours and I have no idea yet whether it's working or not. I am seeing success messages in my db2diag.log.

As for my platform... I'm running DB2 UDB v7.2 under Windows 2000 Server.

I guess I'm simply baffled by the amount of log file space I need to support a database like this, as I don't see much documentation on log file size calculation and management.
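
From what little I have been able to piece together, the active log space is roughly (LOGPRIMARY + LOGSECOND) x LOGFILSIZ x 4 KB pages, so my current setup should be about:

  50 logs x 50000 pages x 4 KB = 10,000,000 KB, i.e. roughly 9.5 GB

and presumably a single uncommitted delete has to fit all of its logged row images inside that. Please correct me if I've got that calculation wrong.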
 
I've had much faster results on z/OS, unloading, sorting and reloading very large numbers of partitioned rows
Can you be more specific? Are you saying you're sorting after you unload? How?
 
ddiamond,

Without giving the full spec: essentially the tables were partitioned into 64 partitions. We ran 63 unload jobs in parallel, unloading all the data except that which was expiring (> 24 months old). The sort had 64 input files, 63 from the unload and the other one being the new data. Output from the sort was 64 files, in partitioning index order. 64 load jobs then ran in parallel against these files to load the database. This resulted in very high row counts being housekept and loaded. The load/unload part used BMC products, which ran amazingly well once the DBA flicked the correct switches.
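
I don't know the Windows tooling nearly as well, but I believe the rough equivalent on DB2 UDB without the BMC products is to export the rows you want to keep and then LOAD ... REPLACE them back in, since LOAD doesn't log the individual rows. Something along these lines from the DB2 command line (the file name is just a placeholder, and check the LOAD documentation for your release - the table space can be left in backup pending state afterwards, so a backup may be needed):

db2 "export to keep.del of del select * from mytable where month(datefield) <> month(current date) - 2"
db2 "load from keep.del of del replace into mytable"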

Greg
 
bunzo99,

in the approach you have described you're not doing a limited number of deletes, you're doing millions. You have coded up a few delete statements, but each one is deleting millions of rows. Essentially, because there is no commit within the millions of rows that get deleted for each day, what happens is this: each row you delete is written to the log files, which is why you have had to increase them. With no commits, the log files are simply held open and many or all of them are in use. They are kept in this state because, if your transaction fails, DB2 has to read the log files to put the database back where it was by undoing all the work your transaction has done.

My suggestion is to extend your predicates, and therefore the number of delete statements you issue, along these lines:

Delete from mytable where month(datefield) = month(current date) - 2 and prime_key < 50000;

commit;

Delete from mytable where month(datefield) = month(current date) - 2 and prime_key >= 50000 and prime_key < 100000;

commit;

and so on.


That way you'd at least be committing every 50,000 rows, rather than the however many million you're committing now. Obviously prime_key is just an example, but hopefully you can find some equivalent range delimiter within your data?
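
One other pattern worth mentioning, though I believe it needs a more recent DB2 release than 7.2 (so check that yours accepts it before relying on it), is deleting through a nested table expression, so each statement only touches a fixed number of rows and you don't need a range column at all:

Delete from (select 1 from mytable
             where month(datefield) = month(current date) - 2
             fetch first 50000 rows only);
commit;

You just repeat that pair until the delete reports no more rows found (SQLCODE +100).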

Cheers
Greg
 
If you are not concerned with being able to retrieve the records from the logs, why not just turn off logging during the deletes? That way each delete is not written to the log.

Turn off logging
delete
commit (if you want)
delete
commit
delete
...
...
...
turn logging back on.
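
For what it's worth, I think the closest DB2 UDB gets to "turning off logging" for deletes is the NOT LOGGED INITIALLY attribute, and it comes with strings attached: the table must have been created with NOT LOGGED INITIALLY, the setting only lasts until the end of the unit of work (so it has to be re-activated after every commit), and if anything fails mid-transaction the table can be left unusable. So treat this purely as a sketch to try on a test copy first:

-- assumes mytable was created with the NOT LOGGED INITIALLY attribute
alter table mytable activate not logged initially;
Delete from mytable where month=<targetMonth> and day=1;
commit;
alter table mytable activate not logged initially;
Delete from mytable where month=<targetMonth> and day=2;
commit;
.... etc.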

 