Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Westi on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

log file sync and log buffer waits

Status
Not open for further replies.

parisdba

MIS
Apr 20, 2002
21
AE


hi,

My problem again is with regard to redo log buffer. I just joined in a firm as a beginner and here the redo log buffer is seen to be set as 29 MB .But still the log space waits and more importantly the log file sync is seen to be increasing in an enormous rate. could anybody give me advice regarding tuning of this. Stats are given below. Commit rate for the application is very high

SQL> select name,value from v$sysstat where name like '%redo%';

NAME VALUE
------------------------------------------------------------redo synch writes 31769
redo synch time 0
redo entries 417554
redo size 128271040
redo buffer allocation retries 4
redo wastage 5873652
redo writer latching time 0
redo writes 32554
redo blocks written 270474
redo write time 0
redo log space requests 4
redo log space wait time 0
redo log switch interrupts 0
redo ordering marks 40

14 rows selected.



SQL> select SUBSTR(EVENT,1,27),TOTAL_WAITS T_W,TOTAL_TIMEOUTS T_T,
2 TIME_WAITED T_WD,AVERAGE_WAIT A_WAIT
3 from v$system_event where event like '%log%';

SUBSTR(EVENT,1,27) T_W Tot_T T_WD A_WAIT
--------------------- ---- ---- ----- -------
log file sequential read 16 0 0 0
log file single write 24 0 0 0
log file parallel write 32563 0 0 0
log file switch completion 4 0 0 0
log file sync 31942 11 0 0


thanx in advance for any help

 
Hi

1.) how often/frequent is your log file switched?
You will find this information in V_$LOGHIST.

2.) how long is your statistics interval? If you are doing it a whole day, you will probably get also the times when large imports are running overnight.

3.) What are the values for log_checkpoint_intervall
and log_checkpoint_timeout for your database in the V_$PARAMETER?

4.) How many db_writer_processes and how many dbwr_io_slaves do you have (to find in V_$PARAMETER also)?
 
hi,

1) The log switch happen every 10-15 min. When there is high volume of transactions then log switches happen very frequently i.e every 1 min or 1 and half min.

2)the interval of the stats were for 4 hours that is the peak of transactions.Now the log file sync wait in system event has reached 82721


3)the value of parameters are
log_checkpoint_interval = 10000
log_checkpoint_timeout = 1800


4) i think it is set to default. I have not found an entry for it in the init.ora file. nor in the v$parameter

thanks
 
I expect that you have archive logs enabled and parallel write in mirrored redo-logs?

It seems that your database writer processes have problems to cope with the redo-log information in these time periods.

Try to increase the number of dbwr and of their slaves.

The size of the redo logs seems ok for a production system, since the data lost when the redo-logs are corrupted will increase then. 15 min seem ok, on our systems we normally try to reach an intervall of 1-3 hours depending on the server. But the 1 min is quite short.

It is dependent what to prefer:

a) if you intend to have fast recovery you set checkpoints often.
b) if you intend to have a high performance and the risk of longer recovery times you set the checkpoints rarely.

In case b) set the log_checkpoint_time = 0 and the value of log_checkpoint_intervall to a value where
log_checkpoint_intevall*db_block_size > redo_log_size.

But nevertheless increasing the number of db-writers should help since you have only 1 process running when defaulting and this is a bit too stressed when you write a lot to disk an to redo and to archives at the same time :)
 
I'm not 100% convinced you actually have a problem. Reading the documentation of the "log file sync" wait event, it looks to me as if every single commit on your database will generate one log file sync wait. Oracle can't complete the commit until it writes the transaction to the redo log, hence it issues a wait until the write is finished. As long as the wait is brief there is no problem, and you don't give any numbers that would suggest lengthy wait times.

As far as I can see, the numbers you are reporting reflect normal activity on a busy database. Are you actually experiencing poor performance? If so, do you know for sure that log file syncs are the cause?
 
hi,

Thanks for your mails.

spenglerr, could you explain what happens when we set log_checkpoint_timeout to 0 . if my guess is right check point should not lag behind this value and so checkpoint should happen continously ?

As karluk said i am not sure whether the performance problem is the real cause but still there is a lot of waits happening in log space request and our log buffer being 29 Mb(which is considered to be huge.) Processor performance of the server goes at 100% durring peak times.I have read in many articles that sizes of above 2mb in log buffer is of no use but how can these log space request be reduced.
could you give me some advise on sizing log buffer.

Everything else seem to be going fine.Te library cahce ratio ,data dictionary cahce ratio and most of the other things i have gone through.even if log_buffer is problem i dont see how it affects the processor performance it should show as an io contention right?
 
Ok, first I agree with Karluk if you do not have a performance problem you will not have to think about this!

A checkpoint causes waits, since it is time consuming and the dbwr will have to wait until the checkpoint is completed before writing anything else.
So if you have a lot of checkpoints in your redo-log you will get a lot of time consuming log writes. Therefore if you have a problem with this you can configure your database to perform only one checkpoint at the end of the redo-log when the log switches. For this purpose you set the time intervall for writing checkpoints to zero to switch this possibility off. Also you set your checkpoint intervall to a value higher than the size of the redo log. Then only one checkpoint per redo-log will be written, BUT
this means that in case of recovery, the recovery will take longer and you have more dirty buffers actual at a certain time.

If you increase the number of database writer processes and/or slaves you have more parallel I/O and thus the writing can cope quickly.
From the start:

1) What operating system are you working on ? UNIX or Windows?

 

hi,

i am working on windows platform(2000 to be specific). how can we know the number of checkpoints happenning as i have not specified log_checkpoint_to_alert in the init file.
 

hi,

i am working on windows platform(2000 to be specific). how can we know the number of checkpoints happenning as i have not specified log_checkpoint_to_alert in the init file.

thanks for your time
 

hi,

yes 2 CPUs with 2 Gb Ram and Oracle files stored in a HP NetAPP Filer
 
hi,

Another addition to the questions posted above. Now a days I am getting high number of log switches durring the day and at peak time upto 2-3 logswitches a min and durring the entire night only 1 or 2 for the entire night. Even with 35M redo log file why is this so? Is it because my log buffer is huge or that there is concurrent transactions of 35M per 1/2 a minute. (Bare in mind that the log buffer is 29 M). How do i find out the number of checkpoints happening if i have not specified log_checkpoint_to_alert=true

Thanks for your replies
bye
 
Hi,

now we can help:

You have high log buffer sync waits and high log buffer parallel write waits. This means that your Log Writer has problems to come up with syncronizing the log buffers with the redo logs.
It looks like your application has high commit rates, which causes the log buffer to be flushed very frequently which inturn makes LGWR work furiously at writing logs. The best known way to help bring this wait period down is to reduce/minimise commit rate.

1.) Since you are telling that all Oracle Files are on a NetApp Filer: Is this running a RAID5? Are the redo logs also on a RAID5? If so move them back from the slow RAID5, since this is quite bad for the LGWR to come along.

Another tuning issue: put the redo logs on a disk with little or no other I/O.

Did you mirror the hard disks of your log-files, if not please do it, since you will get parallel write then on two disks.

2.) LGWR gets problems when you have a lot of rollbacks, this could also explain your high CPU-wastage -> Look forward to get your application to reduce rollbacks whenever possible.

3.) LGWR will get problems when you left tablespaces in HOTBACKUP-mode!? Since this causes higher REDO-generation.

4.) Try to use NOLOGGING whenever this is possible.

5.) Perhaps your log_buffer size is too big, then the LGWR will get too big chunks to write.

A combination of well placed and sized redo logs, a small log_buffer and tuning commit rates can rid you of most problems with redo generation.



If you want information if checkpointing is causing problems look at the statspack parameters:

log file switch (checkpoint incomplete)
log file switch (archiving needed)
log file switch/archive
log file switch (clearing log file)
log file switch completion
log switch/archive
log file sync

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top