vmstat output translation

Status
Not open for further replies.

khalidaaa

Technical User
Jan 19, 2006
2,323
BH
kthr  memory       page                  faults         cpu
----- ------------ --------------------- -------------- -----------
 r  b    avm   fre re pi  po  fr   sr cy  in    sy   cs us sy id wa
 3  2 473425 11516  0 12  16  62  120  0 439 13744 5193 88 12  0  0
 4  2 471568 10458  0 10   0   0    0  0 526 18294 6288 85 14  0  0
 4  2 476338  1525  0 20  28 107  700  0 578 13356 8045 85 15  0  0
 6  2 474212  1974  0 12  43  90  579  0 549 11470 6842 88 12  0  0
 6  2 468905  1695  0 24  11  40   58  0 578 15590 6773 87 13  0  0
 7  2 475681   126  0 33  57 188  354  0 528  8693 6358 89 11  0  0
 5  2 466138  5296  0 16  23  84  143  0 625 11375 2989 92  8  0  0
 7  2 470950   118  0 12  50 155  301  0 549 13914 2174 92  8  0  0
11  2 473488   531  0 29 103 326 1547  0 642 14143 6416 85 15  0  0
 9  2 473739  2835  0 19  76 180  366  0 572 13667 6911 85 15  0  0
 8  2 465827  7977  0 27  70 188  833  0 605 13407 7372 84 16  0  0
 9  2 470989  1774  0 24  49 178 1071  0 579 25893 6441 82 18  0  0
 8  2 469419  1327  0 27  10  79  168  0 598 23657 7262 82 18  0  0
 9  2 467734  1236  0 23   0  78  174  0 591 18229 7633 83 17  0  0
10  2 474225   608  0 34  24 172  441  0 725 18781 8010 82 18  0  0
12  3 473474   267  0 39  52 244 1368  0 758 14813 7345 84 16  0  0
11  2 473359  1378  0 18  47 142  318  0 585 15015 6809 85 15  0  0
10  3 481238    14  0 45  79 240 1504  0 665 15185 2764 91  9  0  0
 9  2 476975  1195  0 17  20  72   93  0 496 19177 2278 91  9  0  0
 8  2 477660  1159  0 18  24 104  174  0 595 18130 6596 87 13  0  0


Can anybody help me interpret this vmstat output?

It's output from an SP 9075-550 machine with 2 x 375 MHz CPUs and 3 GB of RAM.
 
This machine is indeed overloaded. It is constantly at 0% idle and 0% wait, and the applications you are running are saturating the CPUs. Look at the first column (the r value): these are the threads on the run queue. Normally this should be about 3-4. If it's more than that, you have a CPU bottleneck. Your problem is certainly not high disk I/O or network I/O. Did you tune your machine (minfree and maxfree values)? Please post the output of:

vmo -a
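For anyone who wants to check the run queue and CPU figures rather than eyeball them, here is a minimal sketch (not from the original posters) that averages the r column and the us + sy columns of vmstat output. It assumes the 17-column AIX layout posted at the top of this thread; the function name summarize_vmstat is made up for illustration.

```shell
#!/bin/sh
# Summarize AIX-style vmstat rows read from stdin.
# Assumed column order (as posted in this thread):
# r b avm fre re pi po fr sr cy in sy cs us sy id wa
summarize_vmstat() {
    awk '
        # Keep only data rows: 17 fields and a numeric first field.
        NF == 17 && $1 ~ /^[0-9]+$/ {
            n++
            r += $1               # threads on the run queue
            busy += $14 + $15     # user + system CPU percent
            if ($1 > maxr) maxr = $1
        }
        END {
            if (n == 0) { print "no data rows"; exit 1 }
            printf "samples=%d avg_r=%.1f max_r=%d avg_busy=%.1f%%\n", n, r / n, maxr, busy / n
        }
    '
}

# Example using the first three rows from the original post.
summarize_vmstat <<'EOF'
 r  b    avm   fre re pi  po  fr   sr cy  in    sy   cs us sy id wa
 3  2 473425 11516  0 12  16  62  120  0 439 13744 5193 88 12  0  0
 4  2 471568 10458  0 10   0   0    0  0 526 18294 6288 85 14  0  0
 4  2 476338  1525  0 20  28 107  700  0 578 13356 8045 85 15  0  0
EOF
```

On a live system, something like `vmstat 5 20 | summarize_vmstat` should work, provided the column layout matches the one assumed here.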

rgds,

R.
 
I'm running AIX 4.3 on this machine, and the applications running on it are Oracle EMPAC and SQR reports.

RMGBELGIUM, I couldn't run vmo -a :(

See, I'm new to the company I'm currently working for, and I'm new to AIX. My senior administrator used to do everything, but he isn't telling me anything :( He keeps everything to himself :(

So I don't know whether he did any tuning on this machine or not :(
 
The output I get when I run vmo is this:

ksh: vmo: not found.

Same for schedo.
 
vmo, ioo and schedo have only been available since AIX 5.2, I believe. On 4.3 and 5.1 there were the sample programs /usr/samples/kernel/vmtune and schedtune (same dir).

See if you have that dir, or something like /rc.tune that runs at startup.

One warning though: the versions of the sample programs must exactly match the running kernel, or you will run into all sorts of trouble when trying to modify any setting.
You can, however, run the samples without any options; that will give you the current settings. But if you have the wrong version, the results are probably not valid.
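As a rough sketch of the "run it without options" advice, the snippet below only reads the current settings and changes nothing. The default path is the one given above; the function name show_vmtune and the wording of the messages are invented for illustration, and the version warning above still applies to whatever it prints.

```shell
#!/bin/sh
# Print current VMM settings with the vmtune sample program, if present.
# vmtune is run without options, so no setting is modified.
show_vmtune() {
    vmtune_path=${1:-/usr/samples/kernel/vmtune}
    if [ -x "$vmtune_path" ]; then
        "$vmtune_path"
    else
        # The samples ship in the bos.adt.samples fileset.
        echo "vmtune not found at $vmtune_path (is bos.adt.samples installed?)" >&2
        return 1
    fi
}

show_vmtune || echo "skipping: no vmtune on this host"
```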


HTH,

p5wizard
 
There isn't much you can do to tune this system at 4.3, other than adding more CPU or memory.

Mike

"A foolproof method for sculpting an elephant: first, get a huge block of marble, then you chip away everything that doesn't look like an elephant."

 
p5wizard, thank you for your comment.

I could find /usr/samples/kernel/vmtune on a different server, but on the server that is running slowly I found nothing. As shown below, it is not installed there :(

[empacprod]{root}/usr/samples>lslpp -lI bos.adt.samples
lslpp: 0504-132 Fileset bos.adt.samples not installed.

mrn,

What you said is interesting :)

In fact, my company is in the process of moving these machines to a P5 570, but I needed something like vmstat to size the P5 570 accordingly.

That said, how much processing power and RAM do you think I should have for an LPAR running the application that is on the current machine?
 
Take your test/QA system and try 2 CPUs and 4 GB of RAM. If the CPUs are fast enough, you might not need more.
Before going to production, you should do some load testing, at the data volumes you expect close to the end of the life of the system. If you can't get there now, what happens in production when the data reaches this amount?



BocaBurger
<===========================||////////////////|0
The pen is mightier than the sword, but the sword hurts more!
 
BocaBurger

Thanks for your comment :)

This machine is an SP 9075-550 with 2 x 375 MHz CPUs and 3 GB of RAM.

The P5 570 CPU speed is 1.65 GHz, so wouldn't 1 CPU be enough with 3 GB of RAM, as I planned?

Regards
Khalid
 
You also might want to check that there are no "runaway" processes on your machine that eat up all the CPU cycles the application doesn't consume.

Type

ps -el|awk '{if ($6 > 10) print}'

to identify processes with high CPU consumption (experiment a bit with the threshold of 10). If there are any processes that shouldn't be there (not ORACLE or SQR), investigate them further. See if they disappear when the app is stopped (that is, if you can stop the app).
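To make experimenting with the threshold easier, the one-liner can be wrapped in a small function. This is only a sketch: the name high_cpu is made up, and it relies on C being the sixth field of ps -el output, as it is in the listings in this thread.

```shell
#!/bin/sh
# Keep the header line plus any process whose C value (field 6)
# exceeds the given threshold (default 10).
high_cpu() {
    threshold=${1:-10}
    awk -v t="$threshold" 'NR == 1 || $6 + 0 > t'
}

# On the live system: ps -el | high_cpu 10
# Here, two rows posted in this thread stand in for real ps output.
high_cpu 50 <<'EOF'
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
240001 A 105 47336 1 13 70 22 561d5 28216 - 0:07 oracle
240001 A 105 72852 1 111 122 22 56435 27416 - 961:37 oracle
EOF
```

Passing the threshold with -v avoids editing the awk program each time, and the explicit NR == 1 test keeps the header without relying on how awk compares the string "C" with a number.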

If it's only your APPs that are consuming the CPU, go ahead with your tests on the 570 with one CPU, but I'd up the memory to 4GB and see where that gets you.

In any case, for an ORACLE server that stresses your CPU that hard, I would expect some percentage of iowait. Instead, you *always* have enough runnable processes that have no I/O whatsoever to wait for. Somehow that doesn't sound right.


HTH,

p5wizard
 
Thanks, p5wizard, for the nice information :)

I ran the command you gave me and this is the output:

     F S UID   PID PPID   C PRI NI  ADDR    SZ WCHAN TTY   TIME CMD
240001 A 105 47336    1  13  70 22 561d5 28216         -   0:07 oracle
240001 A 105 55174    1  57  94 22 2a42a 36676         -   3:31 oracle
240001 A 105 67514    1  29  79 22 685da 41596         -   0:09 oracle
240001 A 105 72536    1  41  85 22 261a9 28932         -   0:20 oracle
240001 A 105 72852    1 111 122 22 56435 27416         - 961:37 oracle


As you can see, all of those processes are Oracle, so I don't see any unknown process.

But anyway, as mrn said :) this machine needs to be thrown away :)

But now I have a problem with the application on that machine. It currently runs on AIX 4.3 and is not certified to run under any later release of AIX. I believe the P5 570 can only run AIX 5.2 and 5.3, so I won't be able to move this application to an LPAR there.

Anyway, thanks guys for the help :)

Regards
Khalid
 
IMHO it's not right that one oracle process should be consuming that much CPU compared to the other oracle processes. Try and find out what is causing this. I guess you need to talk to the oracle guys...


HTH,

p5wizard
 
It could very well be that one of these oracle processes is a connection to an Oracle database that has status inactive in the database, or no longer exists there at all. We see that now and then on our DBs: a process isn't visible in the DB anymore, but on AIX it is still listed, consuming lots of CPU and doing nothing.


rgds,

R.
 
p5wizard,

Which column of the ps output does the 10 relate to? Is it the 6th column, "C", or the "PRI" column? And what do those columns actually mean? When I run that command on our primary server I get the output below:

--> ps -el|awk '{if ($6 > 10) print}'
     F S UID     PID   PPID  C PRI NI   ADDR    SZ    WCHAN TTY   TIME CMD
 40903 A 200  240394 525482 14  60 -- 1f68ef 10272 34b06898   - 471:38 oninit
 40903 A 200  342970      1 18  60 -- 1f3d6d 10340 3458da98   - 460:42 oninit
 40903 A 200  376870 525482 39  60 -- 14f94f 10320 3474ccac   - 426:39 oninit
 40903 A 200  543492 525482 19  60 --  a7b34 10272 34cf2718   - 529:03 oninit
 40903 A 200  712914 525482 25  60 --  3c181 10280 34acf198   - 646:54 oninit
 40903 A 200 1067026 525482 18  60 --  66c9c 10292 34732698   - 424:42 oninit
 40903 A 200 1159464 525482 18  60 --  adda5 10276 34acf458   - 476:38 oninit
240001 A 950 1176840 879982 46  88 22  73900 12880            -   0:00 fuser
 
Column 'C' is an indication of CPU utilization for a given process.

see man page:
C
(-f, l, and -l flags) CPU utilization of process or thread, incremented each time the system clock ticks and the process or thread is found to be running. The value is decayed by the scheduler by dividing it by 2 once per second. For the sched_other policy, CPU utilization is used in determining process scheduling priority. Large values indicate a CPU intensive process and result in lower process priority whereas small values indicate an I/O intensive process and result in a more favorable priority.
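To get a feel for the "incremented each tick, halved once per second" behaviour, here is a toy model. It is a sketch only, not the real scheduler: it assumes 100 clock ticks per second and a process that is found running on every tick, and it ignores any upper limit the real scheduler may impose, so the numbers are illustrative. The function name simulate_c is made up.

```shell
#!/bin/sh
# Toy model of the C value described in the man page excerpt:
# +1 per clock tick while the process runs, halved once per second.
# Arguments: seconds to simulate, ticks per second (assumed default 100).
simulate_c() {
    awk -v secs="$1" -v ticks="${2:-100}" 'BEGIN {
        c = 0
        for (s = 1; s <= secs; s++) {
            c = (c + ticks) / 2    # one second of ticks, then the decay
            printf "t=%ds C=%.1f\n", s, c
        }
    }'
}

simulate_c 5
```

In this model a constantly running process climbs toward the tick rate instead of growing without bound, while a process that mostly waits on I/O gains few ticks and has what it gained halved every second, so its C stays small.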

HTH,

p5wizard
 