I just went through one of those weekends that every systems administrator dreads -- a server disaster that has no explanation.
We are a small domain -- eight subnets, three controllers, about 100 clients. We're W2K all around on our servers, and our primary master also functions as our Exchange box (I know, I know, but we're a non-profit and money is tight). Friday afternoon we had a power failure, but I was able to get both of our in-house servers (primary master and firewall) shut down in an orderly fashion before the UPS ran out. When the power came back on, the firewall server came back up fine, but the DC would give me an lsass error before I ever got to a login screen. I tried booting to Directory Services Restore Mode, but no password that I had would work. Not the DS restore password, not administrator, not anything. Every other diagnostic mode of W2K would return the lsass error.
At this point, we got Microsoft involved, and the first engineer we talked to had us reinstall Windows to a new directory, join as a member server, and promote to a domain controller, replicating the AD from one of our other DCs. That worked fine until we rebooted after running DCPROMO. Same lsass error before a login screen, no joy on the DS restore password (which, by the way, we took great care to make sure we had correct). We're going a little nuts by now, of course. So we reinstalled Windows yet again, and this time we got an lsass error on reboot before we ever promoted to a DC. So the engineer concluded that the SAM database on every reinstall was getting corrupted, which was crapping out our passwords. He recommended that we reinstall just long enough to get the data off the server (we had a backup from Thursday night, but there had been significant work and e-mail activity on Friday before the power failure that we didn't want to lose). Since this is a RAID 5 server, he thought we should blow away the containers, rebuild them, and reinstall W2K from scratch. He said we had either a hardware failure or a nasty boot sector virus. If it was a virus, the rebuild and reformat would take care of it. If it was hardware, it would manifest itself again, and then we could get the manufacturer involved.
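For anyone curious what that "join as a member server, then promote to a replica DC" step looks like when you script it instead of clicking through the wizard, here's a rough sketch. It writes a dcpromo answer file and kicks off an unattended promotion. The [DCInstall] key names are from my memory of the W2K unattended-setup docs, and the domain name, paths, and credentials are made up for illustration, so double-check all of it against Microsoft's documentation before trusting it.

```python
# Rough sketch of an unattended "promote as replica DC" run.
# Assumes dcpromo.exe accepts /answer: (it does on W2K, as far as I recall)
# and that the [DCInstall] key names below match the unattend docs --
# verify both before using. Domain, paths, and credentials are placeholders.
import subprocess
import tempfile

ANSWER_FILE_TEXT = """\
[DCInstall]
ReplicaOrNewDomain = Replica
ReplicaDomainDNSName = example.org
DatabasePath = C:\\WINNT\\NTDS
LogPath = C:\\WINNT\\NTDS
SYSVOLPath = C:\\WINNT\\SYSVOL
SafeModeAdminPassword = PutYourDSRestorePasswordHere
UserName = administrator
Password = PutYourDomainAdminPasswordHere
UserDomain = example.org
RebootOnSuccess = Yes
"""

def promote_replica_dc():
    # Write the answer file to disk, then hand it to dcpromo.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", delete=False
    ) as answer_file:
        answer_file.write(ANSWER_FILE_TEXT)
        path = answer_file.name
    # dcpromo reads the answer file and does the member-server-to-DC
    # promotion without the wizard, pulling AD from an existing DC.
    subprocess.run(["dcpromo", "/answer:" + path], check=True)

if __name__ == "__main__":
    promote_replica_dc()
```

The point of scripting it is repeatability: when you're rebuilding the same box for the third time in a weekend, the answer file at least guarantees the DS restore password and paths are identical on every attempt.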
Fine. But here's the weird part. Our own virus scanner (up-to-date signature files) had found nothing before the crash, and a run of Trend Micro's HouseCall off the web revealed nothing either. Once we rebuilt the containers and started over, though, everything worked perfectly. No fuss, no muss. We got all the Exchange data back on Sunday, had most of the important files copied over Sunday night, and were back up and running as an organization Monday. It's been perfect since. It's hard to explain, but the server just "feels" better now. Response time from clients seems better, and the console on the server itself is more responsive.
So my question now is, what happened? I was under the impression that boot sector viruses could only be propagated via floppy, and we keep our servers in a locked room to which only I have a key. We run CA's eTrust virus scanner and it never saw a thing. I really doubt that's what it was. But if it's hardware, am I just waiting for the other shoe to drop now, or could there have been some sort of hiccup in the initial build of the containers 19 months ago that just deteriorated over time? If anyone has seen a similar problem, I'd love to hear their thoughts.

Me: We need a better backup system.
My boss's boss: Backup? We don't need no stinkin' backup!