Performance problems of a p690 LPAR

Status
Not open for further replies.
Dec 3, 2003
37
US
Hi,

I urgently need help resolving performance issues on an LPAR (pSeries p690) running AIX 5L version 5.2 ML4, with JFS2 filesystems and DB2.

This LPAR has 20 processors and 54 GB of memory, and it has been running consistently slow for the past few months.

CPU Type: 64-bit
Kernel Type: 64-bit
Memory Size: 55296 MB
Processor Clock Speed: 1100 MHz

# vmstat -v
14155776 memory pages
13645476 lruable pages
42967 free pages
4 memory pools
857065 pinned pages
80.1 maxpin percentage
3.0 minperm percentage
10.0 maxperm percentage
37.0 numperm percentage
5056149 file pages
0.0 compressed percentage
0 compressed pages
37.0 numclient percentage
10.0 maxclient percentage
5053589 client pages
0 remote pageouts scheduled
237 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
27442 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
13160 external pager filesystem I/Os blocked with no fsbuf

And...

# vmstat 5
System Configuration: lcpu=20 mem=55296MB
kthr     memory              page                 faults            cpu
----- ------------- ----------------------- ------------------- -----------
 r  b     avm   fre re pi po    fr    sr cy    in     sy     cs us sy id wa
 7 21 9048650 22517  0  0  0  4167  9888  0  8990  68283  41520 17 13 20 49
12 47 9048720 33556  0  0  0 23692 38067  0 15020 218167 138203 41 19  0 39
16 48 9049104 33959  0  0  0 19885 35170  0 14733 176964 134771 46 20  0 33
12 48 9048896 19801  0  0  0 18607 27337  0 14722 140453 144841 40 21  1 39
16 49 9048945 26710  0  0  0 22837 34880  0 15164 217362 137209 43 20  1 37
14 49 9049036 39515  0  0  0 23348 37338  0 14505 142268 137738 42 20  0 38
16 43 9049113 29740  0  0  0 19641 26499  0 14588 133223 138848 42 19  0 38
14 51 9049812 24093  0  0  0 19282 32294  0 14746 179873 139821 39 21  1 39
 8 54 9049365 34704  0  0  0 22309 29113  0 15491 223155 139550 40 19  1 40
11 48 9049466 26799  0  0  0 17192 27072  0 15425 261954 140777 39 20  1 41
 8 42 9049532 38821  0  0  0 11794 20446  0  7481 128557 123959 23 23  1 53
 9 47 9049636 19601  0  0  0 14265 25433  0 14786 139700 140813 37 18  1 44
15 49 9050206 22118  0  0  0 20124 40855  0 15384 142294 140295 38 20  1 42
14 50 9049851 19408  0  0  0 18509 43938  0 15868 221585 141426 40 19  1 41
11 49 9049984 30651  0  0  0 24060 45054  0 14616 180971 139272 39 19  1 41
13 52 9050086 32471  0  0  0 23127 32627  0 14488 142934 139406 37 20  1 42
 9 48 9050319 19615  0  0  0 20013 28600  0 14762 222597 139666 39 20  1 41
11 51 9050352 33204  0  0  0 24616 32932  0 13888 143170 139895 40 20  1 40
^C#

Please advise me on how to make this system perform better.

Thanks.

Riaz Ahamed
 
Hi

OK, I saw your vmo/ioo settings and they look good.

Here is my input.

Your machine is waiting on I/O.
Check your ranks and your DB layout.
Split paging space, DB data, indexes, and logs onto different ranks.
Collect data over a full day to see whether this persists over a long time.
How do your Shark and FC switch statistics look?
I think you should also check for updates.
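
To see whether the I/O wait really holds over a full day, the vmstat output can be logged to a file and summarized afterwards. Here is a minimal Python sketch, assuming the 17-column AIX "vmstat <interval>" layout shown in the first post. The function names and the "sy_calls" label (used only to tell the two "sy" columns apart) are made up for illustration, and the sample is just the first three rows posted above.

```python
from statistics import mean

# Column order of the data rows as posted in this thread. The second "sy"
# (system CPU %) keeps its name; the first one (system calls) is renamed
# here to "sy_calls" so the two do not collide. That label is an assumption.
COLUMNS = ["r", "b", "avm", "fre", "re", "pi", "po", "fr", "sr", "cy",
           "in", "sy_calls", "cs", "us", "sy", "id", "wa"]


def parse_vmstat(text):
    """Return one dict per numeric data row; headers and junk are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def summarize(rows):
    """Average and peak I/O wait, plus average blocked-thread count."""
    return {
        "samples": len(rows),
        "avg_wa": mean(r["wa"] for r in rows),
        "max_wa": max(r["wa"] for r in rows),
        "avg_blocked": mean(r["b"] for r in rows),
    }


# First three data rows from the original post, header included on purpose
# to show that non-numeric lines are ignored.
SAMPLE = """\
r b avm fre re pi po fr sr cy in sy cs us sy id wa
7 21 9048650 22517 0 0 0 4167 9888 0 8990 68283 41520 17 13 20 49
12 47 9048720 33556 0 0 0 23692 38067 0 15020 218167 138203 41 19 0 39
16 48 9049104 33959 0 0 0 19885 35170 0 14733 176964 134771 46 20 0 33
"""

if __name__ == "__main__":
    print(summarize(parse_vmstat(SAMPLE)))
```

Feeding it a day's worth of logged output instead of SAMPLE gives a quick picture of whether "wa" and "b" stay high around the clock or only during certain jobs.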


Kind Regards.

 
One thing that seems to have been overlooked: in my experience, poor performance can be caused by bad programming. I'm not a DB2 person (I work with Oracle), but I've seen badly written SQL statements bring a system to its knees.

It may be worth backtracking to when your performance problems started and checking the changes that have been made to the code since then.

I've found the following to be a valuable tool when diagnosing problems.


Regards

Mike

"A foolproof method for sculpting an elephant: first, get a huge block of marble, then you chip away everything that doesn't look like an elephant."

 
OK, these are my thoughts on AIO:

If all of the servers consume about the same amount of CPU time, then your I/O load is probably higher than the available aioservers can accommodate, and your app WAITs (not for the ESS, but for the aioservers).

If some of the servers lag behind in CPU time consumption, then you have reached max load only at some point in time, and your config is OK.
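
The rule of thumb above can be written down as a quick check. This is only a sketch of that heuristic: the function name, the 10% spread threshold, and the example CPU times are all assumptions for illustration, and the per-aioserver CPU seconds have to be collected from your own process listing.

```python
from statistics import mean, pstdev


def aio_servers_look_saturated(cpu_seconds, max_rel_spread=0.10):
    """Return True when every aioserver shows roughly the same CPU time.

    cpu_seconds    -- accumulated CPU time of each aioserver process
    max_rel_spread -- assumed cutoff for "about the same": standard
                      deviation divided by the mean must not exceed it
    """
    if not cpu_seconds:
        raise ValueError("need at least one aioserver CPU time")
    avg = mean(cpu_seconds)
    if avg == 0:
        return False  # no server has done any work yet
    return pstdev(cpu_seconds) / avg <= max_rel_spread


if __name__ == "__main__":
    # Hypothetical numbers: all servers equally busy, so more are likely needed.
    print(aio_servers_look_saturated([120, 118, 121, 119]))
    # Hypothetical numbers: some servers lag behind, so the pool is big enough.
    print(aio_servers_look_saturated([120, 90, 30, 2]))
```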

You are at 400, i.e. 20 CPUs x 20 max servers per CPU. More than likely you do not have enough aioservers to go around...

You have enough CPU power and memory, so I would increase the number of aioservers (you need to reboot for the change to become effective).

See also AIX5L's suggestion:

min=64
max=256
maxrequests=15872

Right after boot time, you will count #CPUs x 64 aioservers.
If the app needs more over time, the count will increase up to #CPUs x 256. If you reach this number fairly quickly, then max=256 is probably still too low (not likely).
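
For this 20-CPU LPAR the numbers work out as follows. This is a small Python sketch of the arithmetic only, taking the reading used in this post that min and max are counted per CPU; the function name is made up for illustration.

```python
def aio_server_counts(cpus, min_per_cpu, max_per_cpu):
    """Return (servers right after boot, upper limit of servers)."""
    return cpus * min_per_cpu, cpus * max_per_cpu


if __name__ == "__main__":
    cpus = 20  # lcpu=20 from the vmstat header in the first post

    # Current ceiling mentioned above: 20 max servers per CPU.
    current_limit = cpus * 20
    print("current limit:", current_limit)

    # Suggested values: min=64, max=256.
    at_boot, limit = aio_server_counts(cpus, 64, 256)
    print("suggested: at boot", at_boot, "limit", limit)
```

So the suggested settings would start with more aioservers at boot than the current configuration allows in total.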

I'm not saying that this is THE solution, but imho your AIO setup is part of your problem.

You might want to follow other suggestions like MRN's tip about DB2. Any DB server, however powerful, however big can be brought to its knees by badly written SQL statements...

Also: have you checked out that SDD flash I talked about? I don't know if it is valid in your case, but it might be:
SDD/ESS, DB2, perf degradation (sound familiar?)




HTH,

p5wizard
 