
uptime - number of runnable processes

Status
Not open for further replies.

jprabaker

Technical User
May 31, 2001
185
GB
Hi,

According to the man pages, the load average section of the uptime output shows the "number of runnable processes". On most of my systems this reports something like 0.9, 0.15 and 0.26. Can anyone explain what these figures mean, and how I can work out the average load of the system from them?

regards,

John
 
Taken straight from the man pages...


The load average is the number of runnable processes over the preceding 1-, 5-, 15-minute intervals. This equates to how busy the system is in those intervals. I would suggest that it reflects how readily, on average, a program had a CPU available to run any given instruction.
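
To make the three figures easier to work with, here is a minimal Python sketch that pulls them out of a line of uptime output. The function name parse_load_averages and the dictionary keys are my own choices, and the regular expression assumes the "load average: a, b, c" layout shown in the uptime line quoted elsewhere in this thread; other platforms may format the line differently.

```python
import re


def parse_load_averages(uptime_line):
    """Return the three load averages found in one line of uptime output.

    Assumes the figures appear as 'load average: a, b, c' (or
    'load averages: a b c'); the key names below are illustrative only.
    """
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    one, five, fifteen = (float(value) for value in match.groups())
    return {"1min": one, "5min": five, "15min": fifteen}


# Sample line taken from the uptime output quoted in this thread.
sample = "02:58PM up 59 days, 15:54, 92 users, load average: 1.05, 1.00, 1.12"
print(parse_load_averages(sample))
```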

Bill.
 
jprabaker...

I'm not 100% sure what the figures in the 'uptime' result are, but if it's system load you want to monitor, have you tried 'sar' (System Activity Report)?

If not, you simply 'unhash' (uncomment) the relevant lines in the 'adm' crontab, and the system will take snapshots of the current system loading and average them out. Each day a new file is written (don't worry, these aren't big, and they get overwritten each month). The files are called sa'dd' (where dd is the day of the current month) and are stored in /var/adm/sa.

To view the file, simply type sar -f <filename>

Here is a snippet from one of my files....

08:00:01    %usr    %sys    %wio   %idle
08:20:01       2       3       4      91
08:40:01       4       5       5      86
09:00:00       5       6       5      84
09:20:00       9      10      16      65
09:40:00      12      13       5      71
10:00:00      15      18       3      63

Here you can see the percentage of CPU time spent 1. running user processes (%usr), 2. running system code (%sys), 3. waiting on I/O operations (%wio), and 4. sitting idle (%idle). As you can see, this system of mine doesn't seem too busy today.
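
If you want a single figure out of a report like this, a rough Python sketch along the following lines would average each column. The function name summarize_sar and the "%busy" key are made up for illustration, and treating everything that is not idle (including I/O wait) as "busy" is my assumption, not something sar reports itself.

```python
# The sar figures quoted in this post, pasted in as plain text.
SAR_SNIPPET = """\
08:00:01 %usr %sys %wio %idle
08:20:01 2 3 4 91
08:40:01 4 5 5 86
09:00:00 5 6 5 84
09:20:00 9 10 16 65
09:40:00 12 13 5 71
10:00:00 15 18 3 63
"""


def summarize_sar(text):
    """Average each percentage column of a 'sar -f' style report."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0][1:], lines[1:]  # drop the timestamp column
    totals = dict.fromkeys(header, 0.0)
    for row in rows:
        for name, value in zip(header, row[1:]):
            totals[name] += float(value)
    averages = {name: total / len(rows) for name, total in totals.items()}
    # Assumption: anything that is not idle counts as busy, I/O wait included.
    averages["%busy"] = 100.0 - averages["%idle"]
    return averages


for column, value in summarize_sar(SAR_SNIPPET).items():
    print("%-6s %6.2f" % (column, value))
```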

If you need more information on 'sar', check your man pages. There's loads of info on it in there.

Regards...
 
We are running "big brother" to monitor our machines, and it uses the uptime command to gather CPU statistics, which are then displayed graphically through a web browser.

My problem is that it reports the load average over time to be between 0.5 and 1.8. From this information I would like to be able to calculate an average of the total system resource being used. This would aid in future capacity planning.

Since we have had big brother running on our machines for a couple of months now, we have accumulated a reasonable amount of historical data. It seems pointless not to use this information; it's just a case of understanding what these figures mean....
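
As a starting point for using that history, something like the following Python sketch could condense a series of load-average readings into a mean, a peak, and the share of readings above a chosen level. The sample values are invented (they only stay within the 0.5 to 1.8 range mentioned above), and summarize_load_history and busy_threshold are hypothetical names. Note that this summarizes load averages as they are; it does not turn them into a percentage of system resource used.

```python
from statistics import mean


def summarize_load_history(samples, busy_threshold):
    """Summarize a list of load-average readings collected over time."""
    if not samples:
        raise ValueError("need at least one sample")
    above = sum(1 for sample in samples if sample > busy_threshold)
    return {
        "mean": mean(samples),
        "peak": max(samples),
        "share_above_threshold": above / len(samples),
    }


# Made-up readings for illustration only; real ones would come from the
# monitoring tool's stored history.
history = [0.5, 0.9, 1.2, 1.8, 0.7]
summary = summarize_load_history(history, busy_threshold=1.0)
print(summary)
```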


Regards

John
 
I may be wrong, but my interpretation of load average has always been that it's the average load on the system. It is reached by a complex algorithm which tries to show system load as a meaningful number rather than a lot of difficult statistics.

I have always used the premise that if the load average is under 1 then the system is lightly loaded. Figures between 1 and 3 indicate the system is working hard. I have seen figures of 25 where there is clearly something wrong with the system and it is usually grinding to a halt (no logins, no response to simple commands, etc.).

I can see that bjverzal's answer shows the man page for uptime and explains it as the number of runnable processes.

So are we saying here that my 8-processor AIX system with 92 active users only ever has 1 runnable process?

02:58PM up 59 days, 15:54, 92 users, load average: 1.05, 1.00, 1.12

I'm not convinced!

Alex
 
The uptime command uses a load average that is derived from an internal counter, which is also used in determining the run queue stats in the vmstat command.

The following explanation of vmstat should clear up some of the doubts.

The r column indicates the average number of kernel threads on the run queue (run-able threads) at one-second intervals. This field indicates the number of "run-able" threads. The system counts the number of ready-to-run threads once per second and adds that number to an internal counter. vmstat then subtracts the initial value of this counter from the end value and divides the result by the number of seconds in the measurement interval. This value is typically less than five with a stable workload. If this value increases rapidly, look for an application problem. If there are many threads (especially CPU-intensive ones) competing for the CPU resource, it is quite possible they will be scheduled in round-robin fashion. If each one executes for a complete or partial time slice, the number of "run-able" threads could easily exceed 100.
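
The counter arithmetic described above can be sketched in Python as follows. This is only an illustration of the subtract-and-divide step, not the kernel's or vmstat's actual code; the per-second counts and the starting counter value are invented, and average_run_queue is a hypothetical name.

```python
def average_run_queue(counter_start, counter_end, interval_seconds):
    """Average number of runnable threads over a measurement interval.

    Mirrors the description: (end value - initial value) / seconds.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (counter_end - counter_start) / interval_seconds


# Invented example: runnable-thread counts seen at five one-second samples.
per_second_runnable = [2, 3, 1, 4, 0]

counter = 1000          # arbitrary value of the internal counter at the start
start_value = counter
for runnable in per_second_runnable:
    counter += runnable  # once per second the count is added to the counter

r_column = average_run_queue(start_value, counter, len(per_second_runnable))
print(r_column)
```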

So if your graph has a sharp increase in the slope it should trigger some alarm for you to see which application(s) is/are causing the mischief.
 
I have been told by many veteran SAs that your load average should be no more than the number of processors in the box. So a 4-way box should have a load average of 4 or less; an 8-way box should have 8 or less. Furthermore, the 3 numbers represent the runq-sz queue averaged over the last 1, 5, and 15 minutes. As I understand it, the runq-sz queue is the number of processes waiting to be served. So for the above 8-way AIX example, there is about 1 process waiting to be served. I would view this as very acceptable, if not good.
I have seen a system brought to its knees because the runq-sz queue was about 40, even though the processors were never over 60% busy. The sheer volume of requests killed the performance, even though the processors seemingly were not pegged. Upgrading the box from a 2-way to a 4-way corrected this issue almost instantly.
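
For what it's worth, the rule of thumb above is easy to express in Python. This is just a sketch of that rule as stated here (load average no greater than the processor count); the function names are my own, and the rule itself is folklore rather than anything uptime documents.

```python
def load_per_cpu(load_average, cpu_count):
    """Load average divided by the number of processors in the box."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def within_rule_of_thumb(load_average, cpu_count):
    """True when the load average does not exceed the processor count."""
    return load_per_cpu(load_average, cpu_count) <= 1.0


# The 8-way AIX box from this thread, with a load average of 1.05.
print(load_per_cpu(1.05, 8))          # well under 1 per processor
print(within_rule_of_thumb(1.05, 8))  # passes the rule of thumb
print(within_rule_of_thumb(4.5, 4))   # a 4-way box above 4 does not
```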

crowe
 