Information Store Shuts down during backups only reboot will fix??

Status
Not open for further replies.

Chris71

MIS
Mar 28, 2007
9
US
OK, this is a really strange problem and I need some ideas. Google hasn't really been helping me pinpoint it.

We have three Exchange 2003 SP2 servers: one bridgehead and two back-end Exchange servers. Last night while the backups were running there was an error, the Information Store went down, and the service would not restart until I rebooted the server. It only happens on one of our two servers. We have plenty of space on this email server, about 400 GB free. (This has happened every couple of months during backups.)

The event IDs, in order from the first error:

482, 414 (Source: ESE)
9558 (Source: MSExchangeIS)
492, 471, 481, 471, 481 x2 (Source: ESE)
1005 (Source: Application Error)
Then it fails all night with 9175s.

Event 482
Information Store (3328) First Storage Group: An attempt to write to the file "E:\Program Files\Exchsrvr\mdbdata\E00.log" at offset 3504128 (0x0000000000357800) for 512 (0x00000200) bytes failed after 21 seconds with system error 2 (0x00000002): "The system cannot find the file specified. ". The write operation will fail with error -1811 (0xfffff8ed). If this error persists then the file may be damaged and may need to be restored from a previous backup.

Event 414
Information Store (3328) First Storage Group: Unable to write to section 0 while flushing logfile E:\Program Files\Exchsrvr\mdbdata\E00.log. Error -1811 (0xfffff8ed).

Event 9558
An error occurred while writing to the database log file of storage group "First Storage Group". Attempting to unmount all databases in this storage group.

Event 492
Information Store (3328) First Storage Group: The logfile sequence in "E:\Program Files\Exchsrvr\mdbdata\" has been halted due to a fatal error. No further updates are possible for the databases that use this logfile sequence. Please correct the problem and restart or restore from backup.

Event 471
Information Store (3328) First Storage Group: Unable to rollback operation #2100968 on database E:\Program Files\Exchsrvr\mdbdata\pub1.edb. Error: -510. All future database updates will be rejected.
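
The numbers quoted in these events can be cross-checked with a little arithmetic. Below is a minimal Python sketch, assuming ESE error numbers are shown as 32-bit two's-complement values (which is what the paired decimal and hex figures in the event text suggest). The helper name `to_hex32` is made up for illustration.

```python
def to_hex32(value: int) -> str:
    """Render a signed integer as a 32-bit two's-complement hex string."""
    return f"0x{value & 0xFFFFFFFF:08x}"

# Values copied from Event 482 and Event 414 above.
ese_error = to_hex32(-1811)   # ESE error reported for the failed log write
offset = hex(3504128)         # byte offset into E00.log
size = hex(512)               # size of the failed write in bytes

print(ese_error)  # 0xfffff8ed
print(offset)     # 0x357800
print(size)       # 0x200
```

All three match the hex values in the event text, so the log entries are internally consistent: a single 512-byte write to E00.log failed, and the -1811 ESE error lines up with system error 2 ("The system cannot find the file specified") in the same event.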

Any help is appreciated.
Thanks
 
Are you using an Exchange agent to back up Exchange? It almost looks like a locking issue, where Exchange was trying to write to the log while it was in use. That might very well crash the store.
 
We're using Veritas Backup Exec 9.1 with the Exchange add-on, backing up the mailboxes and shadow copy components only.
 
You don't have a file level AV client trying to scan the Exchange folders on the array, do you?

Pat Richard, MCSE MCSA:Messaging CNA
Microsoft Exchange MVP
Want to know how email works? Read for yourself -
 
Actually, we have Trend Micro ScanMail 6 for Exchange 2000/2003. There isn't much I can disable; if I do, it will not scan any of the mailboxes. This is a good point, though. Any ideas?
 
Sounds like a memory leak...
If it is a memory leak, I would create a scheduled task to stop and start the Veritas services every day, within the hour just before the backup; I have to do this on a few servers.


 
If there's a memory leak, scheduling a task every day is a band-aid fix. The leak itself needs to be resolved.

If Trend Micro is scanning the \exchsrvr folder, there's a problem. Exclude it if it is. Make sure no anti-spyware software is scanning that as well.

Pat Richard, MCSE MCSA:Messaging CNA
Microsoft Exchange MVP
Want to know how email works? Read for yourself -
 
Basically I have three paths I can go down. The facts: these are dependable HP servers, ML360s with MX30 expanded array controllers.

1. There were errors in the logs about the array controller.

Example:

Event 11

The driver detected a controller error on \Device\Scsi\cpqcissm2.

Event 9


The device, \Device\Scsi\cpqcissm2, did not respond within the timeout period.


Then these happen all night, every couple of seconds.


dmio: Harddisk1 write error at block 59119162: status 0xc000000e


For more information, see Help and Support Center at

Event 51

An error was detected on device \Device\Harddisk1 during a paging operation.

Incredibly, neither the array controller nor HP SIM / Insight Management reports any issues, so all I can point to here is a possible array controller problem or a cable problem.

2. It could be related to virus scanning, but this has only happened twice in the past 2+ months, basically two nights in a row. We have also been scanning email with Trend Micro for a long time without issues.

3. We just replaced one of the tape drives in our robotic backup library (2 per unit). It seems unlikely to be the cause, but we didn't have problems until we brought the 2nd tape drive back into service. Also, the event ID errors above happen before the backup jobs even start: the Information Store shuts down, and then the backups cannot attach to the device to back up the mailboxes using the Veritas remote agent.
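
The `status 0xc000000e` in the dmio write error under path 1 can be taken apart using the standard NTSTATUS bit layout. The following is only a sketch; the function name `decode_ntstatus` is made up for illustration and is not part of any Windows tool.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its standard bit fields."""
    severities = ["success", "informational", "warning", "error"]
    return {
        "severity": severities[(status >> 30) & 0x3],  # top two bits
        "customer": bool((status >> 29) & 0x1),        # set for non-Microsoft codes
        "facility": (status >> 16) & 0xFFF,            # 12-bit facility field
        "code": status & 0xFFFF,                       # low 16 bits
    }

# Status value copied from the dmio write error above.
fields = decode_ntstatus(0xC000000E)
print(fields)
# {'severity': 'error', 'customer': False, 'facility': 0, 'code': 14}
```

0xC000000E is STATUS_NO_SUCH_DEVICE, an error-severity code from the default facility. That fits the picture of the disk disappearing from under the operating system, rather than a full volume or a locked file.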

Either way, this is a problem. Thanks for any replies.
Chris
 
Also, the event ID errors above happen before the backup jobs even start: the Information Store shuts down, and then the backups cannot attach to the device to back up the mailboxes using the Veritas remote agent.

That's an entirely different deal then. You say that the event logs show the store stopping before the backup even starts, right? So we can probably eliminate the backups and maybe the antivirus, and look pretty much at the RAID subsystem.
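
The elimination step above boils down to comparing event timestamps with the backup job's start time. Here is a rough Python sketch of that check. The timestamps are hypothetical, since the thread only gives the order of the events, and `errors_before` is a made-up helper name.

```python
from datetime import datetime

def errors_before(events, backup_start):
    """Return the events logged strictly before the backup job started."""
    return [e for e in events if e[0] < backup_start]

# Hypothetical timestamps for illustration; only the event IDs and
# sources come from the thread.
backup_start = datetime(2007, 3, 27, 23, 0)
events = [
    (datetime(2007, 3, 27, 22, 41), "ESE", 482),
    (datetime(2007, 3, 27, 22, 41), "MSExchangeIS", 9558),
    (datetime(2007, 3, 27, 23, 5), "MSExchangeIS", 9175),
]

early = errors_before(events, backup_start)
print([(source, event_id) for _, source, event_id in early])
# [('ESE', 482), ('MSExchangeIS', 9558)]
```

If the ESE and MSExchangeIS errors land in the "before" list on the real server, the backup job cannot have triggered them, which is why attention shifts to the storage layer.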

Have you tried updating the drivers for the RAID subsystem? Resolving those controller issues may very well resolve your store crashing issues.
 
Actually, I did that tonight: new firmware on the RAID array, and all the HP components updated to HP PSP 7.70.

This weekend, I'm going to dig into it more.
 