Weird AIX swap usage

flpgdt · Jan 20, 2011

Hi,

Got this behaviour in one of my servers which I can't quite explain. I'd like some tips.

I got an alarm that my swap usage was up to 80%, which turned out to be true when I checked `topas`.

The problems I've found were:

1) Looking at a `vmstat 5 10`, I found lots of PIs with *0* SRs. How's that even possible?

kthr  memory          page               faults            cpu
----- --------------- ------------------ ----------------- ----------------------
 r  b      avm    fre re  pi po fr sr cy   in     sy    cs us sy id wa   pc    ec
12  1 12245252  92621  0 229  0  0  0  0 1884  56260 24568 84 15  0  0 2.50  99.9
12  1 12245413  90313  0 190  0  0  0  0 1764  51759 23827 86 14  0  0 2.50  99.9
12  1 12245193  88040  0 218  0  0  0  0 1734  69307 25347 85 15  0  0 2.50  99.9
14  1 12246377  83810  0 157  0  0  0  0 1960  80471 24057 84 16  0  0 2.50 100.0
13  1 12246050  79785  0 183  0  0  0  0 2280 103138 21990 81 19  0  0 2.50 100.0
12  1 12245988  77393  0 173  0  0  0  0 1881  51984 22331 84 16  0  0 2.50 100.0
14  1 12246180  74721  0 179  0  0  0  0 1792  52624 20610 79 21  0  0 2.50  99.9
15  1 12246131  72304  0 176  0  0  0  0 2109  58504 23344 82 18  0  0 2.50  99.9
15  1 12246673  68231  0 187  0  0  0  0 2272  73068 25319 85 15  0  0 2.50  99.9
13  1 12246305  66342  0 172  0  0  0  0 1966 104313 21884 83 17  0  0 2.50 100.0
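
A sample like this can be easier to reason about once the paging columns are summarised. The sketch below is illustrative only: `vmstat_summary` is a hypothetical helper name, and it assumes the column order shown in this post (fre in field 4, pi in field 6, po in field 7, sr in field 9), which may differ on other AIX levels or partition types.

```shell
# vmstat_summary: hypothetical helper, reads vmstat output on stdin.
# Assumes the column order of the sample above: fre=$4, pi=$6, po=$7, sr=$9.
vmstat_summary() {
  awk '
    # Only rows whose first field is a number are samples; headers and rules are skipped.
    $1 ~ /^[0-9]+$/ && NF >= 9 {
      n++; pi += $6; po += $7; sr += $9
      if (n == 1) first_fre = $4
      last_fre = $4
    }
    END {
      if (n == 0) { print "no samples"; exit }
      printf "samples=%d avg_pi=%.1f total_po=%d total_sr=%d fre_change=%d\n", n, pi / n, po, sr, last_fre - first_fre
    }'
}
```

Typical use would be `vmstat 5 10 | vmstat_summary`. Fed the ten rows above, it reports avg_pi=186.4, total_po=0, total_sr=0 and fre_change=-26279: pages are coming back in from paging space and the free list is shrinking, but nothing is being scanned or paged out during the sample.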

2) I went off to find who was consuming my precious memory and found that I actually know very little about how to figure that out on AIX.

I found this command somewhere, and it seemed reasonable after reading the manual:

`ps -ealf | head -1 ; ps -ealf | sort -rn +9 | head`

It gave output like this:

     F S  UID     PID PPID  C PRI NI    ADDR     SZ    STIME    TTY   TIME CMD
242001 A util 1581080    1 76  60 20 fb34510 150044 10:55:40  pts/0 103:43 /usr...
242001 A util  569540    1  0  60 20 d235510 142580 11:01:09  pts/0  68:55 /usr/...
242001 A util 1425464    1  4  60 20 43c6510 129916 23:17:58      - 168:02 /usr...
202001 A util  245864    1 83  60 24 da9e510 113008 13:37:22  pts/2  43:26 /usr/...
242001 A util 1163370    1  0  68 24 d69d510 103572 09:55:52 pts/13  17:24 /usr/...
242001 A util  466984    1  0  60 20 5d0c510  83064 11:00:34  pts/0  22:57 /usr/...
242001 A raid 1048782    1  7  60 20 e5b8510  78724 16:41:18  pts/6   0:36 /usr/...
242001 A util  659612    1 13  60 20 edc3510  76400 11:13:17  pts/0  10:57 /usr/...
242001 A util 1134736    1  0  60 20 eb91510  75188 06:21:23      -  27:23 /usr/...

SZ is supposed to be the size in 1K units, according to the man page, which also didn't make much sense: these are Java processes with Xms=1G (or so), yet the biggest process shows only ~150MB. Again.. ?

And lastly, my server is not currently at maximum load but still shows ~20% swap space usage. How do I explain that?

Well.. I'm pretty lost. These things were easier to figure out with Solaris.

Would someone share some thoughts?

thanks,

f.

mrn · Jan 25, 2011

What does `topas -P` show?

On initial viewing, CPU looks like user 85, sys 15, idle 0, wait 0.

Take note of the r column in your first output; that is the number of kernel threads that are running or waiting to run. Looks high to me.

Probably Filesystem Cache is using your swap space.

From experience, if you're running Java you're leaking memory, but in your case memory doesn't look that bad.

Google for "methods for identifying memory leaks in AIX" for a good doc on mem leaks.

Mike

"Whenever I dwell for any length of time on my own shortcomings, they gradually begin to seem mild, harmless, rather engaging little things, not at all like the staring defects in other people's characters."

chgwhat · Jan 25, 2011

First of all you did not say what kind of server, or how many CPU's, or what version of AIX.
Depending on the version, SZ may be in 4K pages, which would mean the large processes are around 600MB in memory.
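
To make the two readings of SZ concrete, here is a rough sketch that converts the SZ column both ways. `sz_to_mb` is a hypothetical helper name, and it assumes SZ is the tenth field, as in the listing posted above (the same field that `sort -rn +9` keys on); which unit is correct on a given AIX level should be confirmed against that system's `ps` man page.

```shell
# sz_to_mb: hypothetical helper, reads ps -elf style lines on stdin.
# Assumes SZ is field 10 and PID is field 4, as in the listing in this thread.
# Prints: PID, raw SZ, MB if SZ is in 1 KB units, MB if SZ is in 4 KB pages.
sz_to_mb() {
  awk '$10 ~ /^[0-9]+$/ { printf "%s %d %.1f %.1f\n", $4, $10, $10 / 1024, $10 * 4 / 1024 }'
}
```

For the largest process in the listing (SZ 150044) this gives about 146.5 MB under the 1K reading and about 586.1 MB under the 4K reading, which is where the "around 600MB" figure comes from.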
If you have 12 CPUs or more the run queue is alright, and 20% swap usage should not be an issue. 20% is approximately what most of our servers have; this is normal, as AIX will swap out memory that is not used or referenced. You have no PI or PO so I think that you are not memory constrained. Is there any problem reported other than the 20% swap?

Tony ... aka chgwhat

When in doubt,,, Power out...

blarneyme · Jan 27, 2011

Are vmo, ioo, schedo and nfso set to defaults, or what? You still have free frames.

Threshold page replacement starts...                        Threshold page replacement stops...
----------------------------------------------------------------------------------------------------
When the number of free page frames on the free list        When the number of free page frames on
reaches minfree.                                            the free list reaches maxfree.
----------------------------------------------------------------------------------------------------
If the number of JFS pages in memory is within minfree      When the number of JFS pages is
pages of maxperm and strict_maxperm is 1.                   (maxperm - maxfree).
----------------------------------------------------------------------------------------------------
If the number of JFS2 or other client pages in memory is    When the number of client pages is
within minfree pages of maxclient and                       (maxclient - maxfree).
strict_maxclient is 1.
----------------------------------------------------------------------------------------------------
If Workload Manager is used and a WLM class has
reached its memory limit.
----------------------------------------------------------------------------------------------------
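
The first row of that table (start at minfree, stop at maxfree) behaves like a simple hysteresis switch. The sketch below only illustrates that one rule; `lru_next_state` is a hypothetical name, the state labels are made up, and the real minfree and maxfree values have to be read from the system's own `vmo` settings rather than assumed.

```shell
# lru_next_state: hypothetical illustration of the minfree/maxfree rule only.
# $1 = current state ("idle" or "stealing"), $2 = free frames (vmstat fre),
# $3 = minfree, $4 = maxfree.
lru_next_state() {
  if [ "$2" -le "$3" ]; then
    echo stealing            # free list has dropped to minfree: replacement starts
  elif [ "$2" -ge "$4" ]; then
    echo idle                # free list has recovered to maxfree: replacement stops
  else
    echo "$1"                # between the two thresholds: keep doing what we were doing
  fi
}
```

For example, `lru_next_state idle 66342 960 1088` prints `idle`; the 960 and 1088 here are placeholder thresholds for illustration, not values taken from this server. With a free list in the tens of thousands of frames and thresholds anywhere near that size, page replacement would not be running, which would be consistent with the zero sr column.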
Criteria                  Description
----------------------------------------------------------------------------------------------------
A) numperm > maxperm      The page can be stolen if it is a file page; if the page is not a file
                          page, it is left in memory.
----------------------------------------------------------------------------------------------------
B) numperm < minperm      The lrud daemon does not care what type of page it is. This page can be
                          stolen if it is unreferenced.
----------------------------------------------------------------------------------------------------
C) numperm <= maxperm     Repaging rates are used to determine if the page can be stolen or replaced:
   && numperm >= minperm  * If the file repage counter is higher than the computational repage
                            counter, then computational pages are stolen.
                          * If the computational repage counter is higher than the file repage
                            counter, then file pages are stolen.
----------------------------------------------------------------------------------------------------
D) lru_file_repage=0      As long as numperm is above minperm, only file pages will be stolen.
                          (Criterion B is ignored.)
----------------------------------------------------------------------------------------------------
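
Read as a decision procedure, criteria A to D could be sketched roughly as follows. This is only a reading of the table, not how lrud is actually implemented: `steal_policy` is a hypothetical name, all inputs are plain integers, the output labels are made up, and treating equal repage counters as "file pages" is an assumption the table does not cover.

```shell
# steal_policy: hypothetical reading of criteria A-D above.
# $1 = numperm, $2 = minperm, $3 = maxperm, $4 = lru_file_repage (0 or 1),
# $5 = file repage counter, $6 = computational repage counter (both default to 0).
steal_policy() {
  if [ "$1" -gt "$3" ]; then
    echo "file pages only"              # A: numperm above maxperm
  elif [ "$1" -lt "$2" ]; then
    echo "any unreferenced page"        # B: numperm below minperm
  elif [ "$4" -eq 0 ]; then
    echo "file pages only"              # D: lru_file_repage=0, numperm not below minperm
  elif [ "${5:-0}" -gt "${6:-0}" ]; then
    echo "computational pages"          # C: file pages are repaging more
  else
    echo "file pages"                   # C: otherwise (a tie is an assumption here)
  fi
}
```

For example, `steal_policy 50 20 80 1 30 10` prints `computational pages`, while the same numbers with lru_file_repage set to 0 print `file pages only`.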
