SQL 2005 and cluster crash: Event IDs 19019 and 1069

Status
Not open for further replies.

pviqueira

Instructor
Aug 23, 2007
ES
Hello,

I have a problem with a SQL Server 2005 SP2 cluster. Several SQL Server 2005 errors appear in the application log, and a little later an error about the cluster service appears in the system log.
This is the cluster:

*clusternode1: HP DL50G5, 4 dual-core processors with Hyper-Threading (HTT), Windows Server 2003 R2 Enterprise Edition SP2 x64.
*clusternode2: the same as clusternode1.
I have tried to solve the problem with these actions:

*Disabled the iLO driver.
*Updated the NIC drivers.
*Created a DWORD registry value SynAttackProtect with value 0.
Any ideas on how to solve the problem?

This is the system event log during the error:

Event Type: Error
Event Source: ClusSvc
Event Category: Failover Mgr
Event ID: 1069
Date: 9/6/2007
Time: 3:02:57 PM
User: N/A
Computer: CLUSTERNODO2
Description:
Cluster resource 'SQL Server' in Resource Group 'MSSQL' failed.

This is the application event log during the error:

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......
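
The "Data:" block attached to each 19019 event is a plain hex dump, and the readable part is UTF-16LE text. A minimal sketch of decoding it follows. The function names are hypothetical, and skipping the first 12 bytes is an assumption made by eyeballing the dump, not a documented layout.

```python
# Hex dump copied from the event above.
HEX_DUMP = """\
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......
"""

HEX_DIGITS = set("0123456789abcdefABCDEF")


def dump_to_bytes(dump):
    """Collect the byte values from each 'offset: xx xx ...' row."""
    out = bytearray()
    for line in dump.splitlines():
        _, sep, rest = line.partition(":")
        if not sep:
            continue  # not a dump row
        for token in rest.split()[:8]:  # at most 8 bytes per row
            if len(token) == 2 and set(token) <= HEX_DIGITS:
                out.append(int(token, 16))
            else:
                break  # reached the ASCII column
    return bytes(out)


def decode_name(dump, header_len=12):
    """Decode the UTF-16LE string that follows the (assumed) 12-byte header."""
    payload = dump_to_bytes(dump)[header_len:]
    text = payload.decode("utf-16-le")
    return text.split("\x00", 1)[0]  # stop at the first NUL terminator


print(decode_name(HEX_DUMP))
```

For this dump the decoded string is DBGRP64, the same name the ERRORLOG later in the thread reports as the server name.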


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]TCP Provider: An existing connection was forcibly closed by the remote host.



For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] OnlineThread: QP is not online.


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:02:09 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Warning
Event Source: MSDTC Client
Event Category: MSDTC Proxy
Event ID: 4359
Date: 9/6/2007
Time: 3:02:49 PM
User: N/A
Computer: DBGRP64
Description:
MS DTC is unable to communicate with MS DTC on a remote system. MS DTC on the primary system established an RPC binding with MS DTC on the secondary system. However, the secondary system did not create the reverse RPC binding to the primary MS DTC system before the timeout period expired. Please ensure that there is network connectivity between the two systems. Error Specifics:d:\nt\com\complus\dtc\dtc\cm\src\iomgrsrv.cpp:1318, Pid: 4276
No Callstack,
CmdLine: "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\sqlservr.exe" -sMSSQLSERVER

For more information, see Help and Support Center at
Data:
0000: 43 00 4c 00 55 00 53 00 C.L.U.S.
0008: 54 00 45 00 52 00 53 00 T.E.R.S.
0010: 41 00 50 00 36 00 34 00 A.P.6.4.
0018: 00 00 ..


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:03:08 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] ODBC sqldriverconnect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:03:08 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 0; message = [Microsoft][SQL Native Client]Unable to complete login process due to delay in opening server connection


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/6/2007
Time: 3:03:08 PM
User: N/A
Computer: CLUSTERNODO2
Description:
[sqsrvres] OnlineThread: Error connecting to SQL Server.


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......


Thanks
pablo
 
What's in the ERRORLOG?

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
 
Hello,

The errors in the application log start at:
----------------------------------------------
Event Type: Error
Event Source: MSSQLSERVER
Event Category: (3)
Event ID: 19019
Date: 9/7/2007
Time: 12:55:01 AM
User: N/A
Computer: CLUSTERNODO1
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
Data:
0000: 4b 4a 00 40 01 00 00 00 KJ.@....
0008: 08 00 00 00 44 00 42 00 ....D.B.
0010: 47 00 52 00 50 00 36 00 G.R.P.6.
0018: 34 00 00 00 00 00 00 00 4.......
----------------------------------------
And this is the cluster service error in the system log:

Event Type: Error
Event Source: ClusSvc
Event Category: Failover Mgr
Event ID: 1069
Date: 9/7/2007
Time: 12:55:11 AM
User: N/A
Computer: CLUSTERNODO1
Description:
Cluster resource 'SQL Server' in Resource Group 'MSSQL' failed.

For more information, see Help and Support Center at
----------------------------------------------

This is the SQL Server error log after the cluster service and the SQL Server service crashed:


2007-09-07 00:55:28.03 Server Microsoft SQL Server 2005 - 9.00.3186.00 (X64)
Aug 11 2007 05:31:24
Copyright (c) 1988-2005 Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)

2007-09-07 00:55:28.03 Server (c) 2005 Microsoft Corporation.
2007-09-07 00:55:28.03 Server All rights reserved.
2007-09-07 00:55:28.03 Server Server process ID is 236.
2007-09-07 00:55:28.03 Server Authentication mode is MIXED.
2007-09-07 00:55:28.03 Server Logging SQL Server messages in file 'H:\MSSQL.1\MSSQL\LOG\ERRORLOG'.
2007-09-07 00:55:28.03 Server This instance of SQL Server last reported using a process ID of 5052 at 9/7/2007 12:54:33 AM (local) 9/6/2007 10:54:33 PM (UTC). This is an informational message only; no user action is required.
2007-09-07 00:55:28.03 Server Registry startup parameters:
2007-09-07 00:55:28.03 Server -d H:\MSSQL.1\MSSQL\DATA\master.mdf
2007-09-07 00:55:28.03 Server -e H:\MSSQL.1\MSSQL\LOG\ERRORLOG
2007-09-07 00:55:28.03 Server -l H:\MSSQL.1\MSSQL\DATA\mastlog.ldf
2007-09-07 00:55:28.03 Server SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
2007-09-07 00:55:28.03 Server Detected 16 CPUs. This is an informational message; no user action is required.
2007-09-07 00:55:28.03 Server Large Page Extensions enabled.
2007-09-07 00:55:28.03 Server Large Page Granularity: 2097152
2007-09-07 00:55:28.03 Server Large Page Allocated: 32MB
2007-09-07 00:55:28.29 Server Using locked pages for buffer pool.
2007-09-07 00:55:28.39 Server Processor affinity turned on: processor mask 0x000000000000000f. Threads will execute on CPUs per affinity mask/affinity64 mask config option. This is an informational message; no user action is required.
2007-09-07 00:55:28.39 Server Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.
2007-09-07 00:55:28.39 Server Lock partitioning is enabled. This is an informational message only. No user action is required.
2007-09-07 00:55:28.40 Server Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
2007-09-07 00:55:29.62 Server Attempting to recover in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
2007-09-07 00:55:29.62 Server Database mirroring has been enabled on this instance of SQL Server.
2007-09-07 00:55:29.64 spid5s Starting up database 'master'.
2007-09-07 00:55:29.73 spid5s Recovery is writing a checkpoint in database 'master' (1). This is an informational message only. No user action is required.
2007-09-07 00:55:29.84 spid5s SQL Trace ID 1 was started by login "sa".
2007-09-07 00:55:29.86 spid5s Starting up database 'mssqlsystemresource'.
2007-09-07 00:55:29.87 spid5s The resource database build version is 9.00.3186. This is an informational message only. No user action is required.
2007-09-07 00:55:30.39 spid5s Server name is 'DBGRP64'. This is an informational message only. No user action is required.
2007-09-07 00:55:30.39 spid9s Starting up database 'model'.
2007-09-07 00:55:30.39 spid5s The NETBIOS name of the local node that is running the server is 'CLUSTERNODO1'. This is an informational message only. No user action is required.
2007-09-07 00:55:30.54 spid9s Clearing tempdb database.
2007-09-07 00:55:30.59 Server A self-generated certificate was successfully loaded for encryption.
2007-09-07 00:55:30.61 Server Server is listening on [ 192.168.1.117 <ipv4> 1433].
2007-09-07 00:55:30.61 Server Server local connection provider is ready to accept connection on [ \\.\pipe\SQLLocal\MSSQLSERVER ].
2007-09-07 00:55:30.61 Server Server named pipe provider is ready to accept connection on [ \\.\pipe\$$\DBGRP64\sql\query ].
2007-09-07 00:55:30.62 Server The SQL Network Interface library could not register the Service Principal Name (SPN) for the SQL Server service. Error: 0x2098, state: 15. Failure to register an SPN may cause integrated authentication to fall back to NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies.
2007-09-07 00:55:30.62 Server SQL Server is now ready for client connections. This is an informational message; no user action is required.
2007-09-07 00:55:30.65 spid13s Starting up database 'msdb'.
2007-09-07 00:55:30.65 spid12s Starting up database 'JEP'.
2007-09-07 00:55:30.75 spid12s Analysis of database 'JEP' (5) is 100% complete (approximately 0 seconds remain). This is an informational message only. No user action is required.
2007-09-07 00:55:30.86 spid9s Starting up database 'tempdb'.
2007-09-07 00:55:30.90 spid14s The Service Broker protocol transport is disabled or not configured.
2007-09-07 00:55:30.90 spid14s The Database Mirroring protocol transport is disabled or not configured.
2007-09-07 00:55:30.92 spid14s Service Broker manager has started.
2007-09-07 00:55:32.06 spid5s Recovery of any in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC) has completed. This is an informational message only. No user action is required.
2007-09-07 00:55:32.06 spid5s Recovery is complete. This is an informational message only. No user action is required.
2007-09-07 00:55:32.78 spid52 Configuration option 'Agent XPs' changed from 0 to 1. Run the RECONFIGURE statement to install.
2007-09-07 00:55:33.25 spid52 Using 'xpsqlbot.dll' version '2005.90.3042' to execute extended stored procedure 'xp_qv'. This is an informational message only; no user action is required.
2007-09-07 00:56:02.28 spid52 Using 'xpstar90.dll' version '2005.90.3186' to execute extended stored procedure 'xp_instance_regread'. This is an informational message only; no user action is required.
2007-09-07 00:56:02.42 spid52 Using 'xplog70.dll' version '2005.90.3042' to execute extended stored procedure 'xp_msver'. This is an informational message only; no user action is required.
2007-09-07 04:56:17.09 spid51 Configuration option 'Agent XPs' changed from 1 to 0. Run the RECONFIGURE statement to install.
2007-09-07 04:56:18.22 spid14s Service Broker manager has shut down.
2007-09-07 04:56:19.34 spid5s SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required.
2007-09-07 04:56:19.34 spid5s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
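
One detail in the log above is easy to miss: it reports 16 detected CPUs but a processor affinity mask of 0x000000000000000f. The sketch below decodes that mask into a list of logical CPU numbers. The function name is hypothetical, and whether this affinity setting is intentional or related to the failures is not established in this thread.

```python
def cpus_in_mask(mask, detected_cpus):
    """Return the logical CPU numbers whose bit is set in the affinity mask."""
    return [cpu for cpu in range(detected_cpus) if (mask >> cpu) & 1]


# Values taken from the ERRORLOG lines above.
AFFINITY_MASK = 0x000000000000000F
DETECTED_CPUS = 16

allowed = cpus_in_mask(AFFINITY_MASK, DETECTED_CPUS)
print("CPUs allowed by the mask:", allowed)
print("Share of detected CPUs:", len(allowed) / DETECTED_CPUS)
```

With these values the mask selects logical CPUs 0 through 3, a quarter of what the server detects.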


Thanks
Pablo
 
SQL Server doesn't appear to have any problems. I'm not sure why the cluster thinks that the SQL Server has crashed.

It appears that the SQL Server is simply not responding to the heartbeat that the cluster is sending out, so the cluster is stopping the services and failing over. Are your applications reporting that they are unable to connect to the SQL Server before the failure occurs? It seems like the SQL Server is simply locking up, possibly due to the SQL Server being too busy to respond.
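
To make the idea concrete, here is a rough sketch of a health check with a deadline. It is not the cluster resource DLL's actual code; `is_alive` and the three probe functions are hypothetical stand-ins. The point is that a server that is too busy to answer in time looks exactly like a server that is down.

```python
import concurrent.futures
import time


def is_alive(probe, timeout_s):
    """Return True only if probe() finishes without error within timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(probe)
    try:
        future.result(timeout=timeout_s)
        return True
    except concurrent.futures.TimeoutError:
        return False  # no answer before the deadline
    except Exception:
        return False  # probe failed outright, e.g. connection reset
    finally:
        executor.shutdown(wait=False)  # do not block on a stuck probe


def fast_probe():
    return "DBGRP64"  # stands in for a query that returns promptly


def stuck_probe():
    time.sleep(0.3)  # stands in for a server too busy to respond
    return "DBGRP64"


def reset_probe():
    raise ConnectionResetError("connection forcibly closed by the remote host")


print(is_alive(fast_probe, 1.0))    # healthy
print(is_alive(stuck_probe, 0.05))  # treated as failed: too slow
print(is_alive(reset_probe, 1.0))   # treated as failed: connection error
```

Both the slow probe and the reset probe come back as "not alive", which is why a busy or unreachable instance can trigger a failover even though its own error log shows nothing wrong.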

Are you getting high CPU on the active node just before the failover?

Denny
 
Thanks mrdenny,

I uninstalled SQL Server 2005 and the cluster service, and I had no network problems. But when I installed the cluster service again, I started to see strange behaviour.

Ping from clusternode1 (192.168.1.111, 10.10.10.1) to itself. It lost 564 packets:

Ping statistics for 127.0.0.1:
Packets: Sent = 120977, Received = 120413, Lost = 564 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 4390ms, Average = 16ms

Ping statistics for 192.168.1.111:
Packets: Sent = 121980, Received = 121416, Lost = 564 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 4390ms, Average = 16ms

Ping statistics for 10.10.10.1:
Packets: Sent = 121979, Received = 121415, Lost = 564 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 4390ms, Average = 16ms

Ping from clusternode1 to clusternode2 (192.168.1.112, 10.10.10.2). It did not lose packets:

Ping statistics for 192.168.1.112:
Packets: Sent = 125505, Received = 125505, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 7ms, Average = 0ms

Ping statistics for 10.10.10.2:
Packets: Sent = 125502, Received = 125502, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 8ms, Average = 0ms

Ping from clusternode1 to another server. It did not lose packets:

Ping statistics for 192.168.1.137:
Packets: Sent = 125508, Received = 125508, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 12ms, Average = 0ms
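
Note that ping prints the loss as a whole-number percentage, so 564 lost packets out of roughly 121,000 still show as "0% loss". A small sketch that recomputes the loss with more precision from the statistics above (two of the blocks are copied here; the function name is hypothetical):

```python
import re

PING_OUTPUT = """\
Ping statistics for 127.0.0.1:
Packets: Sent = 120977, Received = 120413, Lost = 564 (0% loss),
Ping statistics for 192.168.1.112:
Packets: Sent = 125505, Received = 125505, Lost = 0 (0% loss),
"""

PATTERN = re.compile(
    r"Ping statistics for (?P<host>[\d.]+):\s*"
    r"Packets: Sent = (?P<sent>\d+), Received = (?P<received>\d+), "
    r"Lost = (?P<lost>\d+)"
)


def loss_report(text):
    """Map each pinged address to its packet loss in percent."""
    report = {}
    for match in PATTERN.finditer(text):
        sent = int(match["sent"])
        lost = int(match["lost"])
        report[match["host"]] = 100.0 * lost / sent
    return report


report = loss_report(PING_OUTPUT)
for host, loss in report.items():
    print(f"{host}: {loss:.2f}% loss")
```

The loopback loss works out to a little under half a percent, which is small but not zero, and the 4390 ms maximum round trip on a loopback address is at least as unusual as the loss itself.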

It looks like a cluster problem, so I will continue this thread in the cluster section.

Thanks
pablo
 