Hi,
We have a three-node Exchange Server 2003 SP1 cluster running on Windows Server 2003 R2 Enterprise Edition with SP2. The cluster is in an active/active/passive setup with two Exchange Virtual Servers (EVSs): EX001 and EX002. Symantec Mail Security for MS Exchange 5.0 is also running on the cluster.
EVS EX002 fails over every morning around 8:00. The following errors are logged three times, and then a failover is performed (successfully, by the way):
Event Type: Error
Event Source: MSExchangeCluster
Event Category: Services
Event ID: 1005
Date: 19-6-2008
Time: 8:07:08
User: N/A
Computer: CLN002
Description:
Exchange Information Store Instance (EX002): The IsAlive check for this resource failed.
For more information, click Data:
0000: 8c 13 00 00 ?...
Event Type: Error
Event Source: MSExchangeCluster
Event Category: Services
Event ID: 1012
Date: 19-6-2008
Time: 8:07:08
User: N/A
Computer: CLN002
Description:
Exchange Information Store Instance (EX002): The RPC call to the service to take the resource offline failed.
For more information, click Data:
0000: 00 00 00 00 ....
Event Type: Error
Event Source: MSExchangeCluster
Event Category: Services
Event ID: 1011
Date: 19-6-2008
Time: 8:07:08
User: N/A
Computer: CLN002
Description:
Exchange Information Store Instance (EX002): The RPC call to the service to bring the resource online failed.
For more information, click Data:
0000: 01 00 00 00 ....
Event Type: Error
Event Source: MSExchangeCluster
Event Category: Services
Event ID: 1003
Date: 19-6-2008
Time: 8:07:08
User: N/A
Computer: CLN002
Description:
Exchange Information Store Instance (EX002): Failed to bring the resource online.
For more information, click Data:
0000: 01 00 00 00 ....
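For anyone comparing these events with the cluster.log below: the Data bytes can be read as a little-endian DWORD. This is only a minimal Python sketch of that reading; the helper name `decode_dword` is made up for illustration, and treating the four bytes as a single little-endian status value is an assumption on my part.

```python
def decode_dword(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as a little-endian unsigned integer."""
    # Assumption: the event's Data field holds one 32-bit status code.
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return int.from_bytes(raw, byteorder="little")


# Data from event 1005 ("The IsAlive check for this resource failed").
print(decode_dword("8c 13 00 00"))  # 5004

# The RPC status from the cluster.log, shown in hex for comparison.
print(hex(1723))  # 0x6bb
```

Read this way, event 1005 carries 5004, the same status the cluster.log reports for the failed IsAlive check.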
The EVS failed over to node CLN003. Tomorrow morning these errors will occur on CLN003 and a successful failover to cluster node CLN002 will be performed.
I also checked the cluster.log on CLN002 and found the following errors:
000016fc.000019d0::2008/06/19-06:06:58.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES]EcStoreIsAlive RPC exception occurred in call to EcAdminStoreGetVMStatus() with status code 1723 (0x6bb).
000016fc.000019d0::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES]EcStoreIsAlive RPC exception occurred in call to EcAdminStoreGetVMStatus() with status code 1723 (0x6bb).
000016fc.000019d0::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES]EcStoreIsAlive got RPC_S_SERVER_TOO_BUSY countinuously for more then 300 seconds. Assume that the store is dead.
000016fc.000019d0::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] EcStoreIsServerAlive() returned error 0, fIsAlive=FALSE
000016fc.000019d0::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] ExchangeCheckIsAlive: DwStoreIsAlive failed with status 5004.
000016fc.000019d0::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] DwGetIsAliveTimeout: DumpProcessOnIsAliveFailure not set.
000016fc.000019d0::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] Thread exited: DwMonitorSecondaryThread. Returning error 5004.
000016fc.000019d0::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] DwResourceThread::decrement:: g_lThreadCount=9.
000016fc.000019d0::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] EXCHANGE_RESOURCE::Release: Count=2.
000016fc.000019cc::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] DwMonitorPrimaryThread: secondary thread failed. Error 5004.
000016fc.000019cc::2008/06/19-06:07:08.514 ERR Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] EventLogging: Exchange Information Store Instance (EX002): The IsAlive check for this resource failed. Error Code: 5004.
000016fc.000019cc::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] Thread exited: DwMonitorPrimaryThread. Returning error 5004.
000016fc.000019cc::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] DwResourceThread::decrement:: g_lThreadCount=8.
000016fc.000019cc::2008/06/19-06:07:08.514 INFO Microsoft Exchange Information Store <Exchange Information Store Instance (EX002)>: [EXRES] EXCHANGE_RESOURCE::Release: Count=1.
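A note on the timestamps: the cluster.log entries read 06:07:08 while the event log shows 8:07:08. As far as I know, cluster.log is written in GMT, so the two-hour gap would be the local offset. The sketch below lines the two up; the UTC+2 offset and the function name are assumptions for illustration, not something taken from the cluster configuration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the nodes run at UTC+2, which is what the two-hour gap suggests.
ASSUMED_LOCAL = timezone(timedelta(hours=2))


def cluster_log_to_local(stamp: str, local_tz: timezone = ASSUMED_LOCAL) -> datetime:
    """Convert a cluster.log timestamp (assumed GMT) to the assumed local zone."""
    utc_time = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f")
    return utc_time.replace(tzinfo=timezone.utc).astimezone(local_tz)


local = cluster_log_to_local("2008/06/19-06:07:08.514")
print(local.strftime("%H:%M:%S"))  # 08:07:08
```

With that offset, the cluster.log failure at 06:07:08 lines up with the 8:07:08 events above.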
We have been having these issues for quite some time, and they always occur after the cluster has been rebooted. This was done last Monday evening after applying some Windows security hotfixes. If I bring the cluster down in a controlled fashion, the problem disappears again, but surely something must be wrong with it.
Any ideas, anyone?
Thanks!
Jeffrey Kusters
MCSA, MCSE, CCNA, CCNP, VCP-310