
Random breakdown at random times?

Status
Not open for further replies.

Franky8

IS-IT--Management
Feb 14, 2008
15
DK
Hi,

As posted in an earlier message, I have been given the honor of taking over responsibility for my company's servers.

The servers seem to break down randomly, both day and night; sometimes it's the Exchange server, and the next day it's the file server. When it happens, the server can't be pinged. A restart usually fixes the problem. No error messages as such.

The guy before me removed any unnecessary processes from the servers, but other than that he failed to find the problem.

We have the following running Windows Server 2003 Standard:

1 DC
1 FILE server with backup exec 11d
1 EXCHANGE
1 CITRIX

and 1 phone server running Server 2000

The guy before me added 3 virtual servers using VMware, and I don't know much about the reason for that, other than that he wrote "proxies" next to them in the very limited documentation he left for me.

The backup often crashes; I believe it happens when one of the servers hangs during the nightly backup routine.

I know it's a big question, but do you guys have any idea what I should be looking for?

Thanx in advance.
 
I had a problem like this with one of my 2000 servers; once it was upgraded, the problem went away. So was the OS upgraded from 2000? If yes, I would start by loading the latest firmware and hardware drivers, mainly BIOS, RAID and network card.
Don't know much about virtual servers other than that I hate them, because if any hardware fails it brings the whole lot down.
 
I forgot to mention that the servers had been running perfectly for 4 years prior to all these crashes.

I have been told that the only things that changed are Windows updates and Trend Micro antivirus updates.

 
And there is nothing in the event log? No script or scheduled task causing it?
Change your admin password so that the previous admin cannot log in, and do the same for any other accounts he may have created to give himself access.
 
The guy who was in charge before me told me that there was nothing in the event log indicating errors, but I just checked back to the last crash on the Exchange server.

The event log is filled with this error, which occurs every second until the restart:

------
Unable to open LDAP session on directory 'DC' using port number 389. Directory returned the LDAP error:[0x1] Operations Error.

For more information, click ---

Does it make sense?
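
The error above says the Exchange server could not open an LDAP session to the directory 'DC' on port 389. One quick thing to rule out is plain TCP reachability of that port. A minimal Python sketch, assuming Python 3 is available on the machine doing the check and that the name 'DC' resolves from there; the function name `can_connect` is made up for illustration, and a successful TCP connect says nothing about whether the LDAP session itself would work:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and tries to connect;
        # refusals, timeouts and name-resolution failures are all OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("DC", 389)` while the errors are being logged, and again after a reboot, would show whether the port is unreachable only during the failure.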
 
This one also occurs as a system error, every second until the system is restarted:

_____________

The server was unable to allocate from the system nonpaged pool because the pool was empty.

For more information, see Help and Support Center at
______________

Could this be the cause of the crash?
 
I know that Exch 5.5 defaulted to port 389 and needed to be changed to e.g. 390. Not sure what version you're running or whether it was fixed in later versions such as Exch 2000. Maybe someone else has the solution.

This one: The server was unable to allocate from the system nonpaged pool because the pool was empty.
There is a fix; search the MS site or Google it.
 
It sounds more like a DoS attack. Are you sure you haven't got a virus outbreak there?
 
Have you tried running netdiag and dcdiag?
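
For anyone new to these: netdiag and dcdiag are command-line diagnostics from the Windows Support Tools. A rough sketch of running both and keeping the output for later comparison, assuming a machine with Python 3 where both tools are on the PATH; the helper names `run_tool` and `save_diagnostics` and the log file names are made up for illustration:

```python
import subprocess

def run_tool(args, timeout=600):
    """Run a command and return (exit_code, combined_output).

    Returns (None, message) if the program is missing or times out.
    """
    try:
        done = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # fold error output into the same text
            universal_newlines=True,    # decode output as text
            timeout=timeout,
        )
    except FileNotFoundError:
        return None, "not found: " + args[0]
    except subprocess.TimeoutExpired:
        return None, "timed out: " + args[0]
    return done.returncode, done.stdout

def save_diagnostics(tools=("netdiag", "dcdiag")):
    """Run each tool with /v (verbose) and write its output to <tool>.log."""
    for tool in tools:
        code, output = run_tool([tool, "/v"])
        with open(tool + ".log", "w") as fh:
            fh.write(output)
        print(tool, "exit code:", code)
```

Saving one set of logs right after a reboot and another when the errors start gives two reports to compare line by line.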
 
I'll try to search for the fix you mention, GrimR. Hope there is something I can use.

ADB100 - I don't really know. We're running Trend Micro, and it scans without finding anything.

GrimR - I haven't tried netdiag and dcdiag. Keep in mind I'm pretty new to this stuff, and I assume the other guy has tried these basic things, but I guess I have to start from scratch with this.

Thank you very much for all the feedback, means a lot to me. :o)
 
What I am concerned with is the following:

The server was unable to allocate from the system nonpaged pool because the pool was empty.

Is it possible that you have a pagefile size that is too small (and/or you are running out of memory)? When you run out of paging space, all sorts of OS and application errors tend to occur.
You have indicated your predecessor added VMware to the servers/one server. Is it possible you don't have the memory to support these environments?
 
itsp1965, I guess it's likely.

I believe all servers are running with 2 GB of memory.

I don't know anything about VMware virtual servers, but it seems he made 2 or 3 of them, likely on the DC server since it's the one doing the least. But I'll check it tomorrow when I get back to work.

I don't know how much memory the VMware servers are using.

I know he removed a lot of things running in the background, and it did help a little. Before, the system crashed every day; now it crashes about every 3 days.

Most often it seems to happen when the backup is running at night, and then the backup fails as well.
 
After several hours of investigating the logs, I found that the file server has the 2019 pagefile issue all over its log.

The log was almost filled with 2019 events going back 4 months, and it's obvious that the problem goes away with a reboot and then reappears after about 3 days while the nightly backup is running. Then the 2019 event appears every second until the system hangs and is rebooted.

So my thoughts turn to the backup and the Backup Exec software. Some searching on Google showed that I am not the only Backup Exec user with this issue.

Do you guys have any experience with Backup Exec?
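
The pattern described above (quiet after a reboot, then event 2019 every second until the hang) can be pulled out of an exported log by grouping the events into bursts. A minimal Python sketch, assuming the timestamps of the 2019 entries have already been exported from the event log; the function name `find_bursts`, the 60-second gap, and the sample timestamps are all made up for illustration and are not taken from the real log:

```python
from datetime import datetime, timedelta

def find_bursts(timestamps, max_gap=timedelta(seconds=60)):
    """Group event timestamps into bursts.

    Consecutive events no more than max_gap apart belong to the same burst.
    Returns a list of (start, end, count) tuples in chronological order.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][1] <= max_gap:
            start, _, count = bursts[-1]
            bursts[-1] = (start, ts, count + 1)   # extend the current burst
        else:
            bursts.append((ts, ts, 1))            # start a new burst
    return bursts

# Hypothetical sample data: two bursts of once-per-second events, three days apart.
night1 = datetime(2008, 1, 1, 23, 30, 0)
night2 = datetime(2008, 1, 4, 23, 45, 0)
sample = [night1 + timedelta(seconds=i) for i in range(5)]
sample += [night2 + timedelta(seconds=i) for i in range(3)]

for start, end, count in find_bursts(sample):
    print(start, "to", end, "-", count, "events")
```

If the start of each burst lines up with the start of the backup job, that supports the suspicion about the backup window; if not, the backup may simply be the heaviest load of the day.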
 