Table Indexing Takes Over the Server


DarkMan (Programmer)
I've got a table that acts as a log file for a web site. For every impression the site gets, the system adds a row to the db, which I can query later to count impressions and unique IPs. The system used to work okay, but we're starting to see some serious problems as the site has grown.

This table is now over 40 million records long. About 3-4 times a day, the index pages fill up and the server goes through a re-index. This process takes over the system and causes the db server to be unavailable for 5-10 minutes at a time. This is unacceptable (especially when this happens at 3 in the morning. The site is set to page me when we get errors. UGH!)

I'm working on a better method for tracking this stuff, but until then, there are a couple of things I could do to help alleviate the situation.

First, we don't need all the records in the table. I could delete 75% of them (old records), but the server freaks out when I try to delete more than 1000 or so rows at once, and doing 30 million at that rate would take forever. Anyone know a way to bypass the automatic re-indexing while it's deleting, or just a fast and dirty way to delete 30 million records?

Also, is there a way to massage the indexes so they don't have to refresh themselves so often? Even if we could get the server to refresh the index only once a day, it would be a huge improvement, especially if we could enforce a specific time to do so.

Any ideas or steps in the right direction would be most appreciated.

Thanks,
DarkMan
 
Hi,
I had a similar problem on an NT server and wrote this Perl script to clean it up. If it works, you should probably also think about setting up a stored procedure to remove all logs older than a certain date - delete sysfiles where datediff(day, TimeStamp, getdate()) > 20
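
For instance, a purge procedure along those lines might look roughly like this (a sketch only - LogTable and [TimeStamp] are placeholder names, not your actual schema):

-- Sketch only: LogTable and [TimeStamp] are placeholder names
CREATE PROCEDURE PurgeOldLogs
    @DaysToKeep int = 20
AS
    DELETE FROM LogTable
    WHERE datediff(day, [TimeStamp], getdate()) > @DaysToKeep

You could then schedule it as a nightly job.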


Hope this helps

use File::Copy;

$Family   = "lexus";
$Dirpass  = "passed";
$Dirfail  = "failed";
$NetDrive = "H:";
# set time in days before logs are moved #####
$Days = 90;
##############################################
$filedir   = "$NetDrive/log/$Family/$Dirpass/";
$filedir1  = "$NetDrive/log/$Family/$Dirfail/";
$countStop = 300;    # stop after this many files have been moved

print "Checking for logs over $Days day(s) old....\nPlease wait ..\n";

# read the contents of the passed and failed log directories
opendir DIR, $filedir;
my @array = readdir DIR;
closedir DIR;

opendir DIR1, $filedir1;
my @array1 = readdir DIR1;
closedir DIR1;

$countStart = 1;
foreach $files (@array)
{
    $file = "$filedir$files";

    if (-T $file)    # only process text files
    {
        $min = 1;
        checkFile($min, $file);
    }
    else
    {
        next;
    }
}

print "\nChecking failed directories ....\nPlease wait...\n";
$countStart = 1;
foreach $files (@array1)
{
    $file = "$filedir1$files";

    if (-T $file)    # only process text files
    {
        $min = 1;
        checkFile($min, $file);
    }
    else
    {
        next;
    }
}

# move a file to the archive directory if it is older than $Days days
sub checkFile
{
    my ($min, $file) = @_;
    $min *= (60 * 24 * $Days);       # convert days to minutes
    my $t   = time;
    my @tmp = stat($file);
    if ($tmp[9] < ($t - $min * 60))  # compare last-modified time in seconds
    {
        print "File over $Days day(s) old, moving $file to archive!\n";
        move("$file", "$NetDrive/qmds");
        #unlink($file);
        if ($countStart > $countStop) { exit }
        $countStart++;
        return 1;
    }
    return 0;
}
 
You can toggle auto statistics with the following procedure:

EXEC sp_autostats tablename , 'ON'

or

EXEC sp_autostats tablename, 'OFF'

Personally I never have the auto stats on as I prefer to handle that part of the DB maintenance myself where I can have more control.

I have a job, which runs once each evening in a quiet period, that loops through all my tables and runs the UPDATE STATISTICS command as required.
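
A rough sketch of that kind of job (the cursor just walks sysobjects for user tables; this is not my actual job):

-- Loop through all user tables and refresh their statistics
Declare @tbl sysname
Declare tbls Cursor For
    Select name From sysobjects Where type = 'U'
Open tbls
Fetch Next From tbls Into @tbl
While @@fetch_status = 0
Begin
    Exec ('UPDATE STATISTICS [' + @tbl + ']')
    Fetch Next From tbls Into @tbl
End
Close tbls
Deallocate tbls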

Rick.


 
Sorry, I forgot to mention that you can also run DROP INDEX prior to your delete and then use CREATE INDEX afterwards.
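
Something along these lines (the table, index and column names are just placeholders):

-- Placeholder names - use your own table/index/columns
DROP INDEX LogTable.IX_LogTable_LogDate

-- ... run the big DELETE here ...

CREATE INDEX IX_LogTable_LogDate ON LogTable (LogDate)

Just remember the DELETE itself then runs without that index, so make sure its WHERE clause can still find the rows reasonably.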

Rick.
 
Thanks guys,

Unfortunately, I can't use Perl scripts, but I'll turn the auto stats off right now, and try dropping and recreating the index before I try deleting the rows again. (I won't do that 'til sometime in the middle of the night, for obvious reasons.. :)

I'll post here how it went.

DarkMan
 
Another idea might be to archive the table each night and then truncate it after the archive completes. I do this through a Package scheduled to run at 4am. The archive appends to a table in my Archive database.
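
The core of the package step is something like this (ArchiveDB and the table names are placeholders):

-- Append the current rows to the archive table
INSERT INTO ArchiveDB.dbo.LogArchive
SELECT * FROM LogTable

-- Empty the live table (minimally logged, so very fast)
TRUNCATE TABLE LogTable

One thing to watch: any rows inserted between the INSERT and the TRUNCATE are lost, so run it in a quiet window.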

HTH

Ashley L Rickards
SQL DBA
 
I would leave AutoStats on especially if the table has lots of inserts during the day and you have to query using various criteria. If you turn stats update off you'll see a continual degradation in performance.

I believe the best answer to the problem is to reduce the number of records in the table. If you can archive records, Ashley's suggestion is good. If you don't need to retain an archive then just delete the unneeded rows.

The biggest problem with large DELETEs (INSERTs and UPDATEs, also) is that the LOG file grows large as SQL records each transaction in order to be able to ROLLBACK updates.

I usually set up large update processes to run in small batches. For example, I may delete 10000 records at a time and loop through the process. Each DELETE statement would be enclosed in a transaction.

---------------------------

Example:

Declare @rowcnt int, @dd datetime

-- Determine the delete criteria:
-- keep only the last 6 months of data
Set @dd = convert(char(11), dateadd(M, -6, getdate()))

-- Limit number of rows deleted
-- in one statement to 10000
Set Rowcount 10000

-- Determine if any rows can be deleted
Select @rowcnt = count(*)
From TblName Where TblDate < @dd

-- Set up loop using While statement
While @rowcnt > 0
Begin

  -- Begin a transaction
  Begin Transaction

  Delete TblName Where TblDate < @dd
  Set @rowcnt = @@rowcount

  -- Commit transaction to clear the LOG
  Commit Transaction

End

-- Remove the row limit again
Set Rowcount 0

---------------------------

Another solution which may be quicker in this instance is to INSERT the rows you want to retain into a temporary table, truncate the table and reinsert the rows from the temporary table.
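
A sketch of that approach, reusing the placeholder names from the example above:

Declare @dd datetime
Set @dd = convert(char(11), dateadd(M, -6, getdate()))

-- Copy the rows you want to keep into a temporary table
SELECT * INTO #Keep FROM TblName WHERE TblDate >= @dd

-- Empty the big table (minimally logged)
TRUNCATE TABLE TblName

-- Put the retained rows back and clean up
INSERT INTO TblName SELECT * FROM #Keep
DROP TABLE #Keep

If the table has an identity column you will need SET IDENTITY_INSERT ON and an explicit column list for the re-insert.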
Terry L. Broadbent
Programming and Computing Resources
 
Just a thought, depending on the index, but could you adjust the fill factor so the index takes longer to fill up? Unless you've already tried this..
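
For example, rebuilding with a lower fill factor leaves free space on every index page, so it takes longer before page splits start again (the table name and the 70 are just placeholders):

-- Rebuild all indexes on the table, leaving 30% free space per page
DBCC DBREINDEX ('LogTable', '', 70)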

-Eric
 
WOW! All of you had great suggestions I hadn't yet tried! (And I thought I had tried everything). Thanks to everyone. I think these will keep me out of trouble until I can re-do the system.

Thanks to everyone for your generous help with this; you've put me way ahead of the game. :)

DarkMan
 
Darkman,

If you are using SQL Server 2000, you can use DBCC INDEXDEFRAG to defragment indexes. INDEXDEFRAG has many advantages over DBREINDEX. It runs online and will not block processing. This means you can run it throughout the day.

There are some limitations to its effectiveness but it may provide a solution to the rebuild index problem. SQL BOL has lots of details about DBCC INDEXDEFRAG.
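
The syntax, with placeholder names, is simply:

-- Defragment a single index; runs online, so it won't lock everyone out
DBCC INDEXDEFRAG (MyDatabase, LogTable, IX_LogTable_LogDate)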

Let me know if this was of any help.

Terry L. Broadbent
Programming and Computing Resources
 
tlbroadbent,

Thanks! I've not seen that one before. I'm running the defrag right now, and my site isn't dogging out at all... This ought to lengthen the time before I have to update statistics again...

You deserve another twenty stars for that one!

DM
 