fuzzyocelot
Programmer
Hello all!
I’ve been Google’ing and reading up on SQL Server and memory since yesterday and am very much confused. Hopefully someone can help clarify a few things for me.
We have a production cluster (active/passive) running SQL Server 2000 Enterprise SP3a (vs 8.00.818) on Windows Server 2003 Enterprise SP2. There are 2 CPUs per node. We have three instances on the cluster. The main instance has over 140 databases. The other two have around 5 databases each.
The page life expectancy lately has been very low (less than 30 seconds) and is being sustained at that value. It rarely gets above 300 seconds. This is not good. It means that the pages are being removed from the buffer cache too frequently which reduces the value of the cache, right? This means that SQL Server has to perform more disk reads which can degrade performance. Did I get that right so far?
The buffer cache size is 2.51 GB and the hit rate is 99%. The procedure cache size is 6.05 MB and the hit rate is 92.6%.
The server has 8 GB of RAM per node. The max size of the virtual memory is 11.7 GB with 6.11 GB free (52% free). The committed virtual memory is around 5770 MB. I read that anything below 7000 is not good. Is that right?
Pages found in RAM is 100%. The read hit ratio is around 0%. We have AWE enabled and the /PAE and /3GB switches in the boot.ini file. We recently added the /3GB switch and now are wondering if that’s causing problems.
The average queue length for one of the disks lately has been around 2 or so. I’m not sure what that indicates.
There appear to be spikes in soft page faults and hardly any in hard page faults. The highest soft page fault spike is 6889 per second. There seem to be a couple of spikes in page reads around 6 per second. The page file is 4092 MB in size - 25 MB of that is used.
The min server memory for the main instance is 0 and the max is 3584 MB.
The min server memory for the second instance is 0 and the max is 1280 MB.
The min server memory for the third instance is 0 and the max is 1024 MB.
We’ve been seeing CPU utilization sustained between 70 and 100% for the last week or so. We’ve been able to track most of the CPU problem to a single report that someone’s been running. This report normally runs after 5 pm when no one is here because it usually takes over an hour to run. However, recently they’ve been running it during the day which apparently hogs resouces on the server. They’ve since stopped doing this at our request. However, we’re still continuing to see performance problems.
The low page life expectancy has me concerned. Could someone please tell me what I need to do or check next?
I really appreciate it!
Thanks in advance!!!
I’ve been Google’ing and reading up on SQL Server and memory since yesterday and am very much confused. Hopefully someone can help clarify a few things for me.
We have a production cluster (active/passive) running SQL Server 2000 Enterprise SP3a (vs 8.00.818) on Windows Server 2003 Enterprise SP2. There are 2 CPUs per node. We have three instances on the cluster. The main instance has over 140 databases. The other two have around 5 databases each.
The page life expectancy lately has been very low (less than 30 seconds) and is being sustained at that value. It rarely gets above 300 seconds. This is not good. It means that the pages are being removed from the buffer cache too frequently which reduces the value of the cache, right? This means that SQL Server has to perform more disk reads which can degrade performance. Did I get that right so far?
The buffer cache size is 2.51 GB and the hit rate is 99%. The procedure cache size is 6.05 MB and the hit rate is 92.6%.
The server has 8 GB of RAM per node. The max size of the virtual memory is 11.7 GB with 6.11 GB free (52% free). The committed virtual memory is around 5770 MB. I read that anything below 7000 is not good. Is that right?
Pages found in RAM is 100%. The read hit ratio is around 0%. We have AWE enabled and the /PAE and /3GB switches in the boot.ini file. We recently added the /3GB switch and now are wondering if that’s causing problems.
The average queue length for one of the disks lately has been around 2 or so. I’m not sure what that indicates.
There appear to be spikes in soft page faults and hardly any in hard page faults. The highest soft page fault spike is 6889 per second. There seem to be a couple of spikes in page reads around 6 per second. The page file is 4092 MB in size - 25 MB of that is used.
The min server memory for the main instance is 0 and the max is 3584 MB.
The min server memory for the second instance is 0 and the max is 1280 MB.
The min server memory for the third instance is 0 and the max is 1024 MB.
We’ve been seeing CPU utilization sustained between 70 and 100% for the last week or so. We’ve been able to track most of the CPU problem to a single report that someone’s been running. This report normally runs after 5 pm when no one is here because it usually takes over an hour to run. However, recently they’ve been running it during the day which apparently hogs resouces on the server. They’ve since stopped doing this at our request. However, we’re still continuing to see performance problems.
The low page life expectancy has me concerned. Could someone please tell me what I need to do or check next?
I really appreciate it!
Thanks in advance!!!