SERVER FREEZING URGENT HELP NEEDED.

Status
Not open for further replies.

eastwood357

Technical User
May 19, 2003
27
GB
Hi all.

I have 6 dual PIII 900MHz IBM x225 servers with XPA loaded.
The servers are all Windows 2000 terminal servers with SP3 loaded. They are running FR2/SP2; we can't get FR3/SP3 yet as the FTP is having problems and our download speed in South Africa is just not good enough. (Here anyone with a 256k line is considered awesome.)

The problem is our Servers are freezing up at least 6 times a day, at random intervals. This can happen without warning.

The symptoms are:

1) No new user can log in.
2) Administrator can't log into the affected server.
3) If you remove the network cable for 10 seconds and plug it back in, the server comes back online. The users then have to reconnect to their disconnected sessions.

We have tried most hotfixes, removed Load Balancing, and moved over to Advanced Load Balancing without any success. I have looked through all the logs in Event Viewer from when the server hangs and found nothing out of the ordinary.

I know that I am not the only Citrix admin having these problems; there are at least 20 other admins out there with the same problem. To me this means it must be a Citrix issue.

I am also using an Access database.

PLEASE, IF ANYONE HAS ANY IDEAS OR SOLUTIONS LET ME KNOW. CITRIX IS NOT HELPING; THEY MUST THINK THAT SINCE IT IS ONLY A FEW SITES IT IS NOT WORTH THEIR TIME, AS IT MIGHT BE ONLY 2% OF THEIR MARKET.

I HAVE 400+ USERS THAT ARE REALLY GETTING FED UP WITH THE SERVERS.

SO PLEASE POST ANY HELPFUL TIPS THAT YOU CAN THINK OF.

Regards

EASTWOOD.




 
By freezing up - are users who are connected OK or do they lose connection? Can you still ping the server? If so, can you log onto the server using the standard Microsoft terminal server client rather than the Citrix client? If so, then perhaps it's covered by my faq48-3647. In particular:

6. If you get a "Cannot connect to Citrix server: The Citrix Server you have selected cannot be located" for local ICA clients, and a "The MetaFrame server farm reported an unspecified error." for NFuse clients go to the server and open a command prompt. Now, type "query farm /app" to see what the load is like on the server. If you find that it's maxed out on one or more of the published apps/desktops then that's the problem - either the server is overloaded OR the load evaluator you are using on one/more of the apps/desktops is too low. I have recently found an issue with XPFR2 where one of the settings in the Advanced load evaluator seems to increment whether or not people are using the server - this ends up with the server no longer responding to client requests until - eventually - the server decides everything is OK again. This may take hours or days to do, and a reboot may not clear the counters either. Then the server may be fine again for a week or month or more until without warning it seems to die again but without any messages in the error logs etc. The fix? Detach the evaluator from the server/app. Do the "query farm /app" again and check it's OK. FYI I believe that the default load evaluator does not have this problem.
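For anyone who wants to script the check described in the FAQ entry, here is a rough Python sketch. It does not run the command itself; it only scans text you have captured from "query farm /app". Two assumptions are baked in and are not from the FAQ: that each data line ends with the numeric load, and that 10000 is the value reported when something is maxed out. Adjust both to match what your farm actually prints.

```python
# Assumption: a load of 10000 is treated as "maxed out". Change if your farm differs.
FULL_LOAD = 10000


def parse_loads(output: str) -> list[tuple[str, int]]:
    """Return (label, load) pairs for every line whose last token is an integer.

    Assumption: the load is the final whitespace-separated column. Header and
    separator lines are skipped because they do not end in a number.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[-1].isdigit():
            rows.append((" ".join(parts[:-1]), int(parts[-1])))
    return rows


def maxed_out(output: str, full_load: int = FULL_LOAD) -> list[str]:
    """List the labels (app/server columns) whose load is at or above full_load."""
    return [label for label, load in parse_loads(output) if load >= full_load]
```

Save the command output to a text file, read it in, and pass the contents to maxed_out(). Anything it returns is a candidate for the load evaluator issue described above.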

Cheers
 
Are your users connecting via NFuse? There is a May 7 Hotfix (XE102W065)...Issue 45: "The XML Service could go into an endless loop, causing an NFuse Web page to hang during enumeration of applications. As a result, users could not enumerate published applications or connect to published applications."

This occurred in my environment after upgrading to FR2. Active users were able to continue working, however, launching additional sessions was not possible.
 
I am having the same issues here. We are not having it as bad as you are, but I have noticed that our servers actually blue-screen after being locked up for a while.
Event Viewer showed that it was blue-screening with a stop 7f, so I called them and grabbed a hotfix from them. It's helped a little bit, but every once in a while it will still go down.
Here is the Microsoft Q article: Q814789.

Hope it helps out.
 
Hi Again.

Thanks for your replies.

Firstly, Beergood, unfortunately it is not that easy. The servers lock up and you can't access them at all, not via any client, Citrix or Terminal Server.
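To confirm which client paths are dead during a hang, a small port probe can be run from a workstation. This is only a sketch: it assumes the default listener ports (1494 for ICA, 3389 for RDP), which may have been changed on your servers, and the host name in the usage note is made up.

```python
import socket

# Assumption: default ports. ICA normally listens on 1494 and RDP on 3389.
PORTS = {"ICA": 1494, "RDP": 3389}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_server(host: str, ports: dict = PORTS, timeout: float = 3.0) -> dict:
    """Probe each named port and return {name: reachable}."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling check_server("ctx01") (a hypothetical host name) while a server is hung shows whether both listeners have stopped answering or only one of them.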

Secondly, I think I might have found the source of the problem. Reading through the Dr. Watson log, I see that one of our apps keeps crashing. (SOME HOMEGROWN PIECE OF JUNK, NEEDED BY LOCAL USERS TO DO THEIR WORK.)

Mitchero, sorry, no NFuse installed.

Enigma, thanks for your tip, I will be investigating it. Maybe you should check your drwtsn32.log and see if there are any application exception errors; this could point you in the right direction.
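If the log is long, counting the crashes per application is quicker than reading it by eye. The sketch below assumes the usual Dr. Watson layout, where a line reading "Application exception occurred:" is followed shortly by a line like "App: name (pid=1234)". That layout is an assumption, so check it against your own drwtsn32.log before trusting the counts.

```python
import re
from collections import Counter

# Assumption: entries look like "App: <name> (pid=<number>)"; the name may be blank.
APP_LINE = re.compile(r"^\s*App:\s*(.*?)\s*\(pid=(\d+)\)", re.IGNORECASE)


def count_crashes(log_text: str) -> Counter:
    """Count application exceptions per app name found in Dr. Watson log text."""
    counts = Counter()
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if "Application exception occurred" not in line:
            continue
        # The App: line is expected within the next few lines of the entry.
        for follow in lines[i + 1:i + 4]:
            match = APP_LINE.match(follow)
            if match:
                # Fall back to the pid when the log leaves the name empty.
                name = match.group(1) or f"pid {match.group(2)}"
                counts[name] += 1
                break
    return counts
```

Read the file with open(path, errors="replace").read() and pass the text in; count_crashes(text).most_common(5) then lists the worst offenders.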

I hope I solve this Citrix problem fast. Every time a server hangs 50 users want to kill me, and patience is running thin. They are actually calling CITRIX SHITRIX NOW.

Also, my database is Access driven; is yours also Access, Enigma? I'm trying to see if all the servers that are hanging use an Access DB.
 
Hmmm..... so the server is completely frozen (even at the server console itself?), but pulling the network cable out of the server fixes it..... Hmmm....

I'd suggest it's back to basics:

1. You indicate it's happening to more than one of your servers. If so, then are they running exactly the same hardware? If so, are they running exactly the same software - are you using an image to create the servers? If you are, then make sure that you don't have any issues with the images. (For example, make sure that the SIDs are different. Also if you're using Office XP make sure that the servers don't have the same license on them. Either of these will cause similar issues to what you're describing).

2. Fire up task manager on the server(s) and check to make sure that there's nothing there that's hogging memory/resources. With 50 users per server, make sure you aren't running out of memory. Leave this running on the server console so you can see it as soon as there's a problem (obviously disable screen savers, etc - speaking of which, make sure if you are running a screen saver that it's not an opengl saver but just a blanker, oh and make sure that the users themselves can't run screensavers).

3. How many PCI cards do you have in the server? Check for clashes with anything else there - pay particular attention to the network card(s).

4. Do you have SCSI drives and a SCSI tape drive - if so, make sure they're on different channels and ideally different cards.

5. Make sure you are running the latest version of drivers for both your motherboard/server and other hardware - in particular the NIC(s).

6. Make sure that there isn't an issue between the server and your network switch - if you're running in full duplex mode, it might be worthwhile dropping down to half duplex to see if that helps.

7. Are your servers running fixed IP or DHCP? If DHCP, consider moving to fixed.

That should keep you busy for a while... ::)

Cheers & good luck
 
Oh, and also consider dropping down to SP2 on one or more of the servers - I've heard of some issues with SP3, specifically when running IPX as well as IP but who knows if there's anything else in there......

Cheers

 
If the servers are completely locking up, and inaccessible by other means, then you've effectively ruled the Citrix software out of the equation. It won't look like that to your users, but I've worked with a few users...

On the off-chance that it is somehow related to MetaFrame, try rebuilding the LHC on each server, using dsmaint recreatelhc.

The problem lies either with your server build or with the network. You say you have a "homegrown app" that keeps crashing - is that on every server? If not, try removing it from one server and see if that server continues to lock up.

Is there anything in the Event Log that might give a clue - anything prior to lock-ups? Try monitoring CPU, Memory Usage and Paging File usage using Perfmon to get a rough idea of how your resources are managing. If there's a lot of paging, you may need more RAM, even if memory usage shows OK. Other tools, such as Lakeside's Systrack may help paint a clearer picture.
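If you log the Perfmon counters to a CSV file, a few lines of Python can summarise them. This is a sketch only: it assumes the first column is a timestamp and every other column is one counter, and the threshold numbers you pass in are your own judgment calls, not recommended values.

```python
import csv
import io
from statistics import mean


def summarize_counters(csv_text: str) -> dict:
    """Return {counter_name: {"avg": ..., "max": ...}} from CSV log text.

    Assumption: first column is the sample time, remaining columns are counters.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {name: [] for name in header[1:]}
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            try:
                columns[name].append(float(cell))
            except ValueError:
                continue  # blank or non-numeric sample, skip it
    return {
        name: {"avg": mean(values), "max": max(values)}
        for name, values in columns.items()
        if values
    }


def over_threshold(summary: dict, thresholds: dict) -> list:
    """Names of counters whose average exceeds the caller-supplied limit."""
    return [
        name
        for name, limit in thresholds.items()
        if name in summary and summary[name]["avg"] > limit
    ]
```

Comparing a summary taken during a quiet period with one covering the hour before a lock-up is usually more telling than any single absolute number.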

Since everything seems to go back to normal when the NIC is unplugged, the network falls under some suspicion:

Incorrect entries in WINS and DNS (client and server) can cause similar issues - especially WINS. I solved a similar problem recently where a client had duplicate entries. It's unlikely you have the same issue, but check with whoever manages your WINS and DNS servers that the correct records (and reverse look-up records) exist, and that the clients are pointing to the correct servers.
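The forward and reverse look-up part of that check can be sketched in Python. Note the limits: this goes through whatever the operating system's resolver uses, so it exercises DNS and the hosts file, and it does not query WINS directly. The host name in the usage note is hypothetical.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare the names.

    The forward and reverse parameters exist so the resolvers can be swapped
    out for testing; by default the system resolver is used.
    """
    result = {"hostname": hostname, "ip": None, "reverse": None,
              "consistent": False, "error": None}
    try:
        result["ip"] = forward(hostname)
    except OSError as exc:
        result["error"] = f"forward lookup failed: {exc}"
        return result
    try:
        rev_name, aliases, _ = reverse(result["ip"])
    except OSError as exc:
        result["error"] = f"reverse lookup failed: {exc}"
        return result
    result["reverse"] = rev_name
    # Compare both the full names and the short (first label) names.
    wanted = hostname.lower()
    short = wanted.split(".")[0]
    names = [n.lower() for n in [rev_name, *aliases]]
    result["consistent"] = any(n == wanted or n.split(".")[0] == short for n in names)
    return result
```

Running check_dns("ctx01") for each MetaFrame server, from a client and from the servers themselves, shows quickly whether a record is missing or points back to a different name.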

Next check your NIC settings. If they're set to auto everything (most are), then manually set the speed to 100Mb/sec (or whatever's appropriate for your LAN), and Half-Duplex. I've seen issues with older Intel NICs where lockups can be caused when they are set to Full-Duplex. Check that any managed switches and/or hubs local to the MetaFrame servers are set to match.

Lastly, if you have any Windows 95/98 clients, ensure they are not set to become Browse Master (Network Settings). I saw this just once at a large company with multiple sites, where a Windows 95 PC was acting as Browse Master on a remote site. This slowed the entire network down, but hit the Citrix servers hardest.

Hope this is helpful

CitrixEngineer@yahoo.co.uk
 
eastwood357,
The servers that are having this issue are "desktop" servers, which have Word, Excel, Outlook, PowerPoint, SAP GUI, blah blah blah (normal desktop stuff). We do have an "MS Access" server that did have the issue with the stop 7f, but after installing the hotfix it stopped.

According to Microsoft, there is a known issue with Win2k or WinXP machines running Terminal Services becoming unresponsive.

Also, maybe that app is just taking up all the system resources, which causes the server to hang. Maybe you should try a demo of Performance Manager from AppSense. This software keeps your server more stable because it clamps down on processes that take control of your system. (Works pretty well, but it's not too cheap.)

I've also heard good things about RTO. (I haven't tried it myself.)
 
I'm experiencing this issue now as well. I have updated the network teaming utility on our DL360s because I figured it would have something to do with the network... but that didn't fix it.

I double-checked our switches and speed/duplex, but everything is OK.
The thing is that the fix for it is just to unplug the network cable!

I got our vendor to contact Citrix and they are supposed to call me soon.

Hopefully I can post something that will resolve this issue.
 
Hi Enigma99

GOOD NEWS, I HAVE MANAGED TO FIX OUR SYSTEM. NO MORE HANGING. BEEN UP FOR 4 DAYS.

After reading lots of forums and Microsoft websites, we found the necessary patches.

The good news is that it is not a Citrix problem but a Microsoft problem. Too long to go into detail, but 3 MS patches are needed that are not available to the public.

Regards
EASTWOOD
 
Eastwood!!! MY MAN!
I'm glad you found a fix for it.

It's been a real pain in the butt trying to get this thing fixed. I was getting to the point of just calling Microsoft.

If you can, please post those hotfixes!

Thanks!!!
 
Please post info on the patches.

We need this info badly.

At least just the Q numbers.

 
Hi Again.

Well, the patches needed are Q810038, then Q329405 and finally Q328020. These fixed our 6 servers, which have been stable for 7 days now.
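For anyone checking several servers against that list, here is a small Python sketch. It only compares IDs; how you collect the list of hotfixes installed on each server is up to you and is not covered here. The extra ID in the usage note is made up for illustration.

```python
# The three Q numbers from this post, in the order they were applied.
REQUIRED = ("Q810038", "Q329405", "Q328020")


def missing_hotfixes(installed, required=REQUIRED):
    """Return the required hotfix IDs not present in installed, in install order.

    IDs are compared case-insensitively and surrounding whitespace is ignored.
    """
    have = {item.strip().upper() for item in installed}
    return [q for q in required if q not in have]
```

For example, missing_hotfixes(["Q810038", "Q123456"]) returns the two IDs still to be applied, in the order given above.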

I still can't believe our servers are stable. You know how it is when you fight a problem for 6 months and then all of a sudden it comes right. Well, I suppose anyone who has not been fighting this problem would not understand.

Regards

Eastwood.
 
Thank you for such a quick response.

I have only been fighting this problem for 1 month, but I understand what you mean.

 