Hi all
I've been experiencing an interesting issue with a couple of of merge publications on our SQL servers. The error I am seeing is in the distribution agent history is:
"The merge process could not perform retention-based meta data cleanup in database..."
It happens regularly throughout the data - eventually the data pushes through, but then it hangs again with the above message.
I researched the issue and found that the msmerge_contents and msmerge_tombstone tables were extremely large (between 7 and 10 million) on the publication & subscription databases experiencing the issues. We have other merge publications on the same server which are working perfectly - i checked the _contents and _tombstone tables for those databases and they were far smaller.
I also ran sp_who2 and found that a distribution spid was blocking user queries in the affected databases - dbcc inputbuffer revealed that the spid was running the sp_mergemetadataretentioncleanup procedure. I modified the agent profile to not perform metadata retention cleanup, restarted the agent, and sure enough the errors and blocking went away.
So...the question is how to clean up these tables so they no longer cause a problem. I've read that I can schedule sp_mergemetadataretentioncleanup manually out of hours. However if the procedure is not working via the agent profile will it not just cause the same problems when scheduled manually? It wont be as busy so that might help but the system is still 24x7 and we may experience blocking problems again.
Also, something interesting I found was that when i ran a profiler trace on the publisher, all calls to the procedure on the problem databases resulted in 0 writes, whereas all the other databases made several writes on each call... so it seems the sproc isnt actually doing anything on the prob dbs..........so would running the sproc manually actually do anything...the other thing is the procedure doesn't accept database name/id as a parameter, so how does it know what database to perform the clean up on??
The only safe way i can think of getting around this problem is by dropping the merge publications and subscribers, which should then empty the replication tables, and then reset up the publications and subscriptions. It will involve downtime but at least the tables will be cleaned out! If they aren't automatically cleaned up I assume I can just truncate the data.
Once all the publications are set up I will make sure the retention period is as low as it can be to keep the tables trim and hope that the automated retentioncleanup job does what it's supposed to.
Does anyone have a better way of getting around this? Any suggestions for how to avoid getting into this situation. I wonder why the tables were allowed to get so big in the first place, since the sproc should be running every minute?
Thanks!
Di.
I've been experiencing an interesting issue with a couple of of merge publications on our SQL servers. The error I am seeing is in the distribution agent history is:
"The merge process could not perform retention-based meta data cleanup in database..."
It happens regularly throughout the data - eventually the data pushes through, but then it hangs again with the above message.
I researched the issue and found that the msmerge_contents and msmerge_tombstone tables were extremely large (between 7 and 10 million) on the publication & subscription databases experiencing the issues. We have other merge publications on the same server which are working perfectly - i checked the _contents and _tombstone tables for those databases and they were far smaller.
I also ran sp_who2 and found that a distribution spid was blocking user queries in the affected databases - dbcc inputbuffer revealed that the spid was running the sp_mergemetadataretentioncleanup procedure. I modified the agent profile to not perform metadata retention cleanup, restarted the agent, and sure enough the errors and blocking went away.
So...the question is how to clean up these tables so they no longer cause a problem. I've read that I can schedule sp_mergemetadataretentioncleanup manually out of hours. However if the procedure is not working via the agent profile will it not just cause the same problems when scheduled manually? It wont be as busy so that might help but the system is still 24x7 and we may experience blocking problems again.
Also, something interesting I found was that when i ran a profiler trace on the publisher, all calls to the procedure on the problem databases resulted in 0 writes, whereas all the other databases made several writes on each call... so it seems the sproc isnt actually doing anything on the prob dbs..........so would running the sproc manually actually do anything...the other thing is the procedure doesn't accept database name/id as a parameter, so how does it know what database to perform the clean up on??
The only safe way i can think of getting around this problem is by dropping the merge publications and subscribers, which should then empty the replication tables, and then reset up the publications and subscriptions. It will involve downtime but at least the tables will be cleaned out! If they aren't automatically cleaned up I assume I can just truncate the data.
Once all the publications are set up I will make sure the retention period is as low as it can be to keep the tables trim and hope that the automated retentioncleanup job does what it's supposed to.
Does anyone have a better way of getting around this? Any suggestions for how to avoid getting into this situation. I wonder why the tables were allowed to get so big in the first place, since the sproc should be running every minute?
Thanks!
Di.