Event ID 6006 & 6005 "GPClient is taking a long time"

Status
Not open for further replies.

EM64T

IS-IT--Management
Jan 20, 2003
37
0
0
US
I'm having issues with my main Windows 2008 Server x64 (ADS Domain and DNS running from VMware ESXi 3.5) that are proving very difficult for me to track down.

Symptoms:

- Server boots with Event ID 6006 and 6005 from source Winlogon. "The winlogon notification subscriber <GPClient> took 91 second(s) to handle the notification event (CreateSession)."
- Workstations connecting to the domain are also getting event log events pertaining to GPclient.
- Server takes around 3 1/2 minutes to boot.
- As I watch my secondary DNS server I get spikes in requests there.
- Users lose their connections to printers listed in the ADS, or are unable to add them.
- Only some computers in the building are affected by lost connections, and only some users (different users can log in to the same computer and get different results).
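
For anyone wanting to see how widespread the slow GPClient notifications are, here is a minimal sketch that pulls the subscriber name, event, and duration out of exported event log text. The regex is based only on the message wording quoted above and may not match other Windows builds; the function name and the 30-second threshold are my own assumptions.

```python
import re

# Pattern derived from the single message quoted in this thread (assumption:
# other Winlogon 6005/6006 messages follow the same wording).
SLOW_SUBSCRIBER = re.compile(
    r"winlogon notification subscriber <(?P<name>[^>]+)> "
    r"took (?P<secs>\d+) second\(s\) "
    r"to handle the notification event \((?P<event>\w+)\)",
    re.IGNORECASE,
)


def slow_subscribers(lines, threshold_secs=30):
    """Return (subscriber, event, seconds) for messages at or above the threshold."""
    hits = []
    for line in lines:
        match = SLOW_SUBSCRIBER.search(line)
        if match and int(match.group("secs")) >= threshold_secs:
            hits.append(
                (match.group("name"), match.group("event"), int(match.group("secs")))
            )
    return hits


if __name__ == "__main__":
    sample = [
        "The winlogon notification subscriber <GPClient> took 91 second(s) "
        "to handle the notification event (CreateSession).",
    ]
    for name, event, secs in slow_subscribers(sample):
        print(f"{name} spent {secs}s handling {event}")
```

Feeding it the lines of an event log export from several workstations gives a quick picture of which machines are waiting on Group Policy and for how long.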

What I have tried:

- Disabling IPv6.
- Installing all latest Windows Updates.
- Checked core and ancillary switches in the building.
- Removing all group policies and starting fresh. Removed any extra settings and log on/log off scripts.
- Reverting all Windows Updates (going back to a prior snapshot taken of the VM).
- Disabling A/V software.
- Updating VMware Tools.

I *think* this might be a DNS issue, but I'm unsure as to how to narrow it down. I've run nslookup and everything appears to be pointing in the right direction. No event log items on the DNS side of things. Pings to and from the server work fine.

I was having zero problems with this setup until late this summer, when we had several new network printers installed. I did find a few of the printers sharing IPs with other devices on the network. I believe that was cleared up, but the issues still remain.

I would appreciate any ideas you guys might have for narrowing this issue down. I'm very perplexed, and I've had enough of the shotgun approach since it is getting me nowhere.

Thanks!

Chris
 
I'm having this exact same issue, nearly word for word, but I'm using ESXi 4.0 and it is my Domain Controller that is having the issue. I had a @#($ storm yesterday when both servers failed because of these events. Now they are back up, but I don't trust them enough to even test anything. I'd really like to know how you fixed this issue.

 
mynameisgunnar, there were two things I ended up doing.

The first thing was apparently ESXi 3.5 has issues with more than one processor configured for a VM. I edited the configuration of the Windows 2008 Server VM to run with just one processor, and it made a noticeable difference in stability AND boot time (as in, it knocked a few minutes off).

For the GPCLIENT error messages it ended up being a DNS issue. My main server runs Windows DNS (the 2008 Domain/ADS server). My secondary DNS server (I run SimpleDNS) was incorrectly configured, meaning it wasn't getting updates from the primary server and acting as a stand-alone. Once I configured the secondary properly the change on the network was instant. Printers were working, logon scripts ran, logins were quick, and the GPClient Event Log messages have completely disappeared.

I'm still not exactly sure what the issue was but I wonder if the Windows Primary DNS was getting tied up with requests from other machines and requests to the secondary weren't getting the machines anywhere.
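
One way to catch a secondary that has stopped receiving updates is to compare the zone's SOA serial on both servers; a secondary that answers with an older serial (or no SOA at all) is not in sync. Below is a rough, standard-library-only sketch of that check. The function names, the zone name, and the server addresses in the usage note are all hypothetical, and it only handles a plain UDP reply with no truncation.

```python
import socket
import struct

SOA_TYPE = 6  # DNS record type number for SOA


def build_soa_query(zone, query_id=0x1234):
    """Build a minimal DNS query packet asking for the zone's SOA record."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack(">HH", SOA_TYPE, 1)  # QCLASS 1 = IN


def _skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if (length & 0xC0) == 0xC0:  # compression pointer: 2 bytes, ends the name
            return offset + 2
        if length == 0:              # root label ends the name
            return offset + 1
        offset += 1 + length


def parse_soa_serial(packet):
    """Extract the serial from the first SOA answer, or None if there is none."""
    qdcount, ancount = struct.unpack(">HH", packet[4:8])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            ">HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == SOA_TYPE:
            # SOA RDATA ends with five 32-bit fields; the serial is the first.
            rdata = packet[offset:offset + rdlength]
            return struct.unpack(">I", rdata[-20:-16])[0]
        offset += rdlength
    return None


def soa_serial(server_ip, zone, timeout=3.0):
    """Ask one DNS server for the zone's SOA serial over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_soa_query(zone), (server_ip, 53))
        reply, _addr = sock.recvfrom(4096)
    return parse_soa_serial(reply)


def serials_match(primary_ip, secondary_ip, zone):
    """True when both servers report the same, non-missing SOA serial."""
    primary = soa_serial(primary_ip, zone)
    secondary = soa_serial(secondary_ip, zone)
    return primary is not None and primary == secondary
```

Calling something like `serials_match("192.0.2.10", "192.0.2.11", "example.local")` (placeholder addresses and zone) right after making a change on the primary should return `True` once the secondary has picked up the update.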

On a side note, I also had a massive Conficker infection to deal with (40+ workstations infected). I'm not sure if that caused the DNS issues with all the requests the worm was making, but it's another possibility.

I hope some of the above helps you. Let us know if it does. Good luck!
 
The first thing was apparently ESXi 3.5 has issues with more than one processor configured for a VM.

It's not that it has issues with more than one vCPU being assigned; it's that in the virtual world thread scheduling/dispatch isn't handled the way most people think it is. If you have a VM with a thread that needs to be scheduled for execution, you would expect that the OS + VMware (or Hyper-V, since it does the same thing) would just grab one CPU and execute it there. But it doesn't, because the thread scheduler inside the OS would barf if it was running on a multi-CPU system but only saw one CPU for scheduling. So the hypervisor doesn't provide CPU resources to a dual vCPU VM until it can provide two CPUs to that VM, even if the VM only needs one. On a heavily loaded ESX host it is much easier to get one free CPU than it is to get two, so the VM spends more time waiting for free resources. You can see this by running esxtop from the service console and monitoring the %RDY (ready time) and %WAIT statistics for your VMs. If your VM has four vCPUs assigned then it's even more problematic, because it has to wait for four processors to be available. ESX tries to mitigate this problem by upping the priority weighting for VMs with multiple vCPUs, which helps but doesn't eliminate the issue.
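
To put rough numbers on "easier to get one free CPU than two", here is a toy calculation. It assumes each physical CPU is busy independently with the same probability and that a VM can only run when as many CPUs as it has vCPUs are idle at once, which is the behaviour described above; real hypervisor schedulers are more nuanced, so treat this purely as an illustration. The function name, the 8-CPU host, and the 85% load figure are made up for the example.

```python
from math import comb


def prob_at_least_k_free(physical_cpus, busy_prob, k):
    """P(at least k physical CPUs are idle), each busy independently with busy_prob."""
    idle_prob = 1.0 - busy_prob
    return sum(
        comb(physical_cpus, free)
        * idle_prob ** free
        * busy_prob ** (physical_cpus - free)
        for free in range(k, physical_cpus + 1)
    )


if __name__ == "__main__":
    # Hypothetical host: 8 physical CPUs, each busy 85% of the time.
    for vcpus in (1, 2, 4):
        chance = prob_at_least_k_free(8, 0.85, vcpus)
        print(f"{vcpus} vCPU(s): enough idle CPUs {chance:.1%} of the time")
```

Under these assumptions the chance of finding enough idle CPUs drops every time the vCPU count goes up, and the gap widens as the host gets busier.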

Best practices for deploying VMs are to make all of them single vCPU unless:

a. They absolutely need more than one CPU of processor capacity, or

b. The ESX host that you are running on will not be oversubscribed with regard to CPU.

Generally speaking, if your application supports distributed servers you will get better overall performance running two VMs with two vCPUs each than a single VM with 4 vCPUs. The advantage is a little smaller when we're talking about 2 single vCPU VMs versus one dual vCPU VM, and it is heavily dependent on the workload that the ESX host is running.

________________________________________
CompTIA A+, Network+, Server+, Security+
MCTS:Windows 7
MCTS:Hyper-V
MCTS:System Center Virtual Machine Manager
MCSE:Security 2003
MCITP:Enterprise Administrator
 
One thing I'm curious about though, my ESXi hypervisor was not even close to being loaded. I have 8 physical CPUs for it to work with, and three VMs configured on the machine (this problem VM had two, the other two VMs had one assigned CPU each). I would think that the hypervisor would have plenty of waiting CPUs. I might be wrong though, I'm new to VMs and haven't really delved into how hypervisors work.

Anyway all I specifically remember is that there was a post on the VMware support forums about an issue with multiple CPUs being assigned to a VM in ESXi 3.5 which was supposedly corrected in 4.0. I did as suggested in that thread and dropped the number of CPUs to 1 for the problem VM. Worked like down-town.

Anyway per your post I will now be configuring VMs with one CPU as default. Thanks!
 
I can back up what kmcferrin is saying. This is something that VMware brings up in training often: the fewer CPUs the better. Almost all of my VMs are 1 CPU.

I think you are right about the 6005/6006 issue being DNS, but when I dug further I noticed that both this event ID and the DNS issues that caused it happened after the 2008 server froze. Something had to happen earlier to cause the freezing. The freezing then in turn caused all sorts of other issues, but I'm still trying to get to the root of it.
 
Do/did you run ESET NOD32 on that server, by any chance?

I believe an early version of 4.0 had problems with Windows 2008 Server. I don't know the specifics, but I had to revert to NOD32 3 for a while. It appears to have been fixed now.
 
Are you f'n kidding me! YES. We were running version 2.7 though. The problem went away when we removed it, but I didn't want to say this because I wanted to know of any other way that was used to fix this (other than removing my AV).
 
If you can, run a snapshot/backup and try the latest version out. I've attached a screenshot of the version I'm running for reference.


I've been running this for about a month now without any freezing issues.

Another thing to check with ESET is to make sure it isn't constantly scanning mapped drives. I also had this problem when I used Jungle Disk: it was pegging our internet connection (and EXPLORER.EXE) by scanning it all the time.
 
Right now my exclusion list is empty and set to scan everything (note I don't have any mapped drives on the controller at the moment).
 