VSS not clearing transaction logs

Sambooka · Feb 19, 2013

Hi

We have a two-node DAG. One of the nodes had a drive fill up to 100% because of the iOS 6.1 calendar issue.

We flipped the store to the second node (with a larger disk) and resolved the iOS 6.1 issue. Replication to the secondary node failed because of a lack of disk space, so we suspended replication. At no time did the store go offline.

We did a full VSS backup to truncate the logs. The backup was successful but the logs were not truncated. Could this be because the secondary DAG member is in a failed and suspended state? We can't restart replication because there is not enough space on the secondary member for all the transaction logs.

We are in a bit of a bind... thoughts?

Thanks

Sambooka · Feb 20, 2013

Found an MS KB article that says yes: it is perfectly normal that the logs don't get cleared while a suspended/failed DAG replica is present.
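
For anyone trying to picture the rule, here is a minimal Python sketch of the behaviour described in this thread. It is a simplified model, not Exchange's actual logic: the `DatabaseCopy` class, the status strings and `highest_truncatable_generation` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    server: str             # hypothetical node name
    status: str             # e.g. "Mounted", "Healthy", "FailedAndSuspended"
    last_log_replayed: int  # highest log generation this copy has replayed


def highest_truncatable_generation(backed_up_through, copies):
    """Return the highest log generation that may be removed, or 0 if none.

    Simplified assumption: any copy that is not mounted or healthy
    blocks truncation entirely, as observed in this thread.
    """
    for copy in copies:
        if copy.status not in ("Mounted", "Healthy"):
            return 0
    # Otherwise nothing newer than the backup, or newer than what the
    # slowest copy has replayed, may be removed.
    return min([backed_up_through] + [c.last_log_replayed for c in copies])


# Made-up example: the passive copy is failed and suspended.
copies = [
    DatabaseCopy("NODE1", "Mounted", 5000),
    DatabaseCopy("NODE2", "FailedAndSuspended", 1200),
]
blocked = highest_truncatable_generation(5000, copies)
```

With the made-up copies above, `blocked` is 0: the backup completes but no logs are eligible for removal until the suspended copy is resumed or removed.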

ntinlin · Feb 22, 2013

Correct. It's a bit of a pain during DR tests as well if you cut your network link for instance.
It's part of the reason you always want to leave a LOT of wiggle room for logs. I've seen runaway logs due to calendar problems, for instance, too many times, though fewer on 2010 than 2007 thanks to the built-in throttling.

In your situation the best bet might be to remove the replica(s), do another backup to truncate the logs and then recreate the replica. Or add disk space to the other node of course.

Your DAG nodes should really be the same in all respects when it comes to the space for the databases and logs, although you may have extra on one I suppose for a recovery database. We just tend to add a LUN for that when required.

Neill
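
To put a rough number on "wiggle room", here is a small Python sketch, assuming 1 MB per transaction log file (the log file size in Exchange 2007 and 2010). The function name and the free space and log rate figures are made up for illustration; plug in your own.

```python
LOG_FILE_MB = 1  # assumption: 1 MB per log file (Exchange 2007/2010)


def hours_until_full(free_gb, logs_per_hour):
    """Estimate hours until the log volume fills at a steady log rate."""
    if logs_per_hour <= 0:
        raise ValueError("logs_per_hour must be positive")
    free_mb = free_gb * 1024
    return free_mb / (logs_per_hour * LOG_FILE_MB)


# Made-up example: 50 GB free, a runaway device generating 2000 logs/hour.
remaining = hours_until_full(50, 2000)
print(f"About {remaining:.1f} hours until the volume is full")
```

The estimate ignores truncation by backups, so it is a worst case for the situation in this thread, where truncation was blocked.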

Sambooka · Feb 22, 2013

Thanks Neill.

What a PITA it was... had to call MS support. Had to break the replica, make sure all the other DBs were also on the same node as the backup (some VSS writer issue) and run the backup. That worked. After that we couldn't recreate the replica, so we had to manually copy the log files to the passive node and then force a reseed of the ContentIndex.

Ugh..

ntinlin · Feb 25, 2013

Sounds like such fun.

DAGs are wonderful in my opinion but they do throw up little issues like this.

Have you taken steps to balance the disk space between the nodes?

Neill

fizelsche · Feb 25, 2013

Yeah,

but this problem is really easy to handle..

Sambooka · Feb 26, 2013

The primary and secondary are still on different size disks for now.
The primary is on the smaller of the two and we are at about 60% of the smaller.

Short story long in case anyone is interested.
One user (out of 40) had a phone with the iOS 6.1 calendar bug, and it filled the disk with transaction logs.
The disk could not be expanded, so we expanded the disk for the secondary and did a failover until we could find the user and kill off the process that was generating the logs. By this point the original primary was 100% full and replication of the DB was failed and suspended.

Needed to clear the transaction logs but couldn't do that while replication was failed and suspended (didn't know that).
Removed the database copy (broke the replica), moved ALL other DBs to the same node (which is normally the secondary) and did a backup. Had to move all DBs because of some VSS writer bugginess (you can't back up a primary node and flush logs if that node is also a secondary for a different DB).

Once that was done we tried to recreate and reseed the replica, but it would always fail saying it was missing a log file. Confirmed the file was there, so we dismounted the DB, manually copied all the log files over and remounted the DB. Not done yet.

The replica came up as healthy, but when we tried to activate it, it said: Nope! My indexing status is currently Crawling. Waited 12 hours and it was still crawling. So we called MS and they said hmmmm, it shouldn't take that long, let's reseed just the content index. Did it from the shell (don't remember the command but it is googleable) and it took only a few minutes.

So, the primary DB is up and running. The secondary is Healthy/Healthy. Tried activating the secondary: IT WORKED! Failed all DBs over to the primary node, installed updates on the secondary and rebooted: success. Failed everyone over to the secondary, installed updates on the primary and rebooted: success. Full backups of all DBs: success.

Hope someone googles this and finds it useful

Sambooka · Feb 26, 2013

I don't see the edit button so...

I THINK the problem with recreating the replica was the fact that I did a full backup first. Someone in another forum mentioned that they had a similar problem. If MS hadn't suggested just reseeding the content index, I would have tried cleanly dismounting the DB, removing the transaction logs, remounting the DB and immediately creating a new DB copy.

ShackDaddy · Feb 26, 2013

I agree with that last solution. Sometimes it's best to just dismount and start fresh with new logs. No reason to try and "handle" all those crap iOS-generated logs.

Dave Shackelford
ThirdTier.net
TrainSignal.com
