Communication link failure during merge replication

Status
Not open for further replies.

jcaulder

Programmer
Apr 22, 2002
241
US
Using SQL Server 2000 as the publisher with 35 MSDE subscribers in merge replication. One of the subscribers had been replicating fine for months, but recently we started getting failures that consistently state: "The process could not deliver delete(s) at the 'Subscriber'." The error details for this failure always show a "Communication link failure", but I cannot find much information about that error.

I can ping the subscriber with no problem, and pinging with a 1 MB payload also seemed to succeed, albeit with high latency.

This error seems to imply a hardware or networking problem. I have been able to synchronize about 5 out of 60 times over the past two days, although I'm not convinced those runs fully synchronized even though they reported success. Using Profiler at both the publisher and the subscriber, everything appears to be going fine until a sudden failure, at which point everything is aborted.

Does anyone have any information about a possible cause of this error? According to the error details, it appears to be failing at the following step, although Profiler did not seem to reveal this:

{?=call sp_MSdelsubrows(?,?,?,?,?,?,?,?)}

Thanks!
 
Just thought I'd pass along what ended up being the solution for me in case it helps someone else. It seems the "communication link failure", at least in my case, had nothing to do with hardware or networking. SQL Server was reporting errors in the event log on the subscriber machine.

These errors led me to run an integrity check on the database using DBCC CHECKDB. This returned 85 consistency errors. I then ran DBCC CHECKDB with the REPAIR_REBUILD option to try to resolve them. This fixed all but 3 of the errors. Those remaining were in MSMerge_Contents, a critical table for replication. I re-ran the repair multiple times, but each time I still received 2 or 3 errors, with the offending rowguid changing each time. This led me to believe that the indexes were not being fully rebuilt during the repair process for this system table.
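
For anyone following along, the check and repair steps above might look roughly like the sketch below on SQL Server 2000 / MSDE. The database name SubscriberDb is a placeholder, not the actual name from this system. The single-user step is included because the repair options of DBCC CHECKDB require it.

```sql
-- Sketch only: SubscriberDb is a placeholder database name (assumption).
-- Stop the merge agent for this subscriber before starting.

-- 1. Report consistency errors, suppressing informational messages.
DBCC CHECKDB ('SubscriberDb') WITH NO_INFOMSGS
GO

-- 2. Repair options require single-user mode. ROLLBACK IMMEDIATE
--    rolls back open transactions from other connections.
ALTER DATABASE SubscriberDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

-- 3. REPAIR_REBUILD performs repairs, including index rebuilds,
--    that carry no risk of data loss.
DBCC CHECKDB ('SubscriberDb', REPAIR_REBUILD)
GO

-- 4. Check again to see which errors, if any, remain.
DBCC CHECKDB ('SubscriberDb') WITH NO_INFOMSGS
GO
```

Taking a backup of the subscriber database before step 3 is a sensible precaution.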

So I rebuilt the indexes on the table manually using DBCC DBREINDEX('MSMerge_Contents'). This table had four indexes, and they rebuilt fairly quickly. I then checked the database again using DBCC CHECKDB and all errors were gone.
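
A rough sketch of the manual rebuild and re-check follows. In SQL Server 2000 the documented command name is DBCC DBREINDEX, and passing only the table name rebuilds every index on that table. SubscriberDb is again a placeholder database name, and the table name is written as it appears in this post.

```sql
-- Sketch only: SubscriberDb is a placeholder database name (assumption).
USE SubscriberDb
GO

-- Rebuild all indexes on the merge metadata table.
-- On a case-sensitive collation the name must match the catalog exactly.
DBCC DBREINDEX ('MSMerge_Contents')
GO

-- Check just this table first, which is quicker than a full database check.
DBCC CHECKTABLE ('MSMerge_Contents')
GO

-- Then confirm the whole database is clean.
DBCC CHECKDB ('SubscriberDb') WITH NO_INFOMSGS
GO
```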

I took the database out of single-user mode and restarted the replication job. The database was then able to synchronize successfully.
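
Returning the database to normal access could be done along these lines. This is a minimal sketch, with SubscriberDb as a placeholder database name.

```sql
-- Sketch only: SubscriberDb is a placeholder database name (assumption).
-- Allow normal connections again so the merge agent can connect.
ALTER DATABASE SubscriberDb SET MULTI_USER
GO

-- Confirm the access mode. The UserAccess property reports
-- SINGLE_USER, RESTRICTED_USER, or MULTI_USER.
SELECT DATABASEPROPERTYEX('SubscriberDb', 'UserAccess') AS UserAccess
GO
```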

Hope this helps someone else.

J
 