Hdisk 100 % Busy

rehaanali · Aug 7, 2008

Hi All,
I have 2 servers showing a similar problem.
On the first one, hdisk7 shows 100% busy almost all the time.
Some info on the first server (i.e. hdisk7):
Topas o/p
Topas Monitor for host: slv8002 EVENTS/QUEUES FILE/TTY
Thu Aug 7 13:46:30 2008 Interval: 2 Cswitch 16134 Readch 0.2G
Syscall 199.7K Writech 1018.3K
Kernel 19.4 |###### | Reads 36239 Rawin 275
User 71.8 |##################### | Writes 2749 Ttyout 27624
Wait 6.2 |## | Forks 179 Igets 0
Idle 2.6 |# | Execs 184 Namei 9719
Physc = 4.48 %Entc= 149.3 Runqueue 3.5 Dirblk 0
Waitqueue 21.0
Network KBPS I-Pack O-Pack KB-In KB-Out
lo0 1287.5 1434.1 1434.1 643.7 643.7 PAGING MEMORY
en2 721.3 1343.6 1552.6 393.1 328.2 Faults 33399 Real,MB 10240
Steals 2061 % Comp 80.8
Disk Busy% KBPS TPS KB-Read KB-Writ PgspIn 374 % Noncomp 19.6
hdisk7 100.0 7171.0 872.9 7002.9 168.1 PgspOut 0 % Client 19.6
hdisk8 88.0 2395.0 250.6 2357.0 38.0 PageIn 2822
hdisk0 85.0 1660.7 404.7 1518.6 142.1 PageOut 201 PAGING SPACE
hdisk4 15.0 786.3 89.0 304.1 482.2 Sios 3023 Size,MB 20480
hdisk6 1.0 136.1 34.0 0.0 136.1 % Used 38.8
hdisk3 0.5 0.3 0.5 0.3 0.0 NFS (calls/sec) % Free 61.1
hdisk9 0.0 8.0 1.0 0.0 8.0 ServerV2 0
hdisk2 0.0 0.3 0.5 0.3 0.0 ClientV2 0 Press:
hdisk1 0.0 12.0 1.5 0.0 12.0 ServerV3 0 "h" for help
hdisk5 0.0 0.0 0.0 0.0 0.0 ClientV3 0 "q" to quit

vmstat o/p
vmstat

System configuration: lcpu=12 mem=10240MB ent=3.00

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
5 3 3746578 10154 0 12 17 74 42 0 245 177139 6567 72 19 8 1 2.96 98.6

vmstat -v o/p
vmstat -v
2621440 memory pages
2465489 lruable pages
10080 free pages
4 memory pools
737830 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
19.8 numperm percentage
488372 file pages
0.0 compressed percentage
0 compressed pages
19.8 numclient percentage
80.0 maxclient percentage
488372 client pages
0 remote pageouts scheduled
9474970 pending disk I/Os blocked with no pbuf
43736733 paging space I/Os blocked with no psbuf
2740 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
3744284 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults

lspv
hdisk7 00c47cbf8aab5a6e abcvg concurrent

Second server:
On this server, hdisk0 and hdisk1 show 100% busy almost all the time.

Some info on this server:
Topas o/p
Topas Monitor for host: cov7202 EVENTS/QUEUES FILE/TTY
Thu Aug 7 13:48:28 2008 Interval: 2 Cswitch 6311 Readch 0.1G
Syscall 43432 Writech 0.1G
Kernel 17.9 |###### | Reads 2974 Rawin 7
User 56.6 |################ | Writes 9223 Ttyout 1694
Wait 20.6 |###### | Forks 50 Igets 0
Idle 4.8 |## | Execs 47 Namei 2360
Runqueue 5.5 Dirblk 0
Network KBPS I-Pack O-Pack KB-In KB-Out Waitqueue 8.0
en2 9.9 37.5 44.5 4.1 5.8
lo0 2.0 11.0 11.0 1.0 1.0 PAGING MEMORY
Faults 8269 Real,MB 7424
Disk Busy% KBPS TPS KB-Read KB-Writ Steals 28651 % Comp 88.5
hdisk0 100.0 1444.0 204.0 1232.0 212.0 PgspIn 20 % Noncomp 12.1
hdisk1 100.0 1491.8 215.5 1280.0 211.8 PgspOut 0 % Client 12.1
hdisk10 100.0 59.8K 551.5 404.0 59.4K PageIn 13015
hdisk3 49.0 44.9K 91.5 44.9K 0.0 PageOut 15362 PAGING SPACE
hdisk6 8.0 658.0 74.0 624.0 34.0 Sios 28311 Size,MB 21504
hdisk11 1.5 958.0 193.5 948.0 10.0 % Used 45.4
hdisk5 0.0 0.0 0.0 0.0 0.0 NFS (calls/sec) % Free 54.5
dac1 0.0 1620.0 268.5 1572.0 48.0 ServerV2 0
hdisk7 0.0 0.0 0.0 0.0 0.0 ClientV2 0 Press:
hdisk2 0.0 0.0 0.0 0.0 0.0 ServerV3 0 "h" for help
hdisk8 0.0 4.0 1.0 0.0 4.0 ClientV3 0 "q" to quit
hdisk13 0.0 0.0 0.0 0.0 0.0
hdisk9 0.0 0.0 0.0 0.0 0.0
dac0 0.0 104.7K 643.0 45.3K 59.4K
hdisk4 0.0 0.0 0.0 0.0 0.0
hdisk12 0.0 0.0 0.0 0.0 0.0

vmstat o/p
vmstat

System Configuration: lcpu=8 mem=7424MB

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
5 2 3908300 5079 0 2 3 185 66 0 216 56145 5443 46 4 49 1

vmstat -v o/p
vmstat -v
1900544 memory pages
1826025 lruable pages
5137 free pages
5 memory pools
530543 pinned pages
80.0 maxpin percentage
3.0 minperm percentage
90.0 maxperm percentage
11.8 numperm percentage
216534 file pages
0.0 compressed percentage
0 compressed pages
11.8 numclient percentage
90.0 maxclient percentage
216534 client pages
0 remote pageouts scheduled
1113932 pending disk I/Os blocked with no pbuf
31562198 paging space I/Os blocked with no psbuf
2740 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
10020930 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults

lspv
hdisk0 00ca111e83692a45 rootvg active
hdisk1 00ca655e8380c4d6 rootvg active

Any help/suggestions/comments are appreciated,

Thanks,
Rehaan.

THOR01 · Aug 7, 2008

Is this on a SAN?

rehaanali · Aug 7, 2008

Hi,

Yes, hdisk7 is on SAN.

Thanks,
Rehaan.

khalidaaa · Aug 7, 2008

Hi Rehaan,

Disks can hit 100% utilization for many reasons. First of all, what kind of application is running on each server? Have you traced the processes that are creating this load? Could you please post the output of vmstat 1 10 from each server? Could you also list your LTG size? Have a look at this article to determine your LTG size:

http://www.ibm.com/developerworks/aix/library/au-aix5l-lvm.html

Your vmstat -v output would be more helpful if you captured it before and after the load, i.e. before you start the process that creates the load and again after you stop it (or after it has been running for a while).
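
For anyone wanting to follow this suggestion, here is a minimal Python sketch (not from the thread) that compares two saved vmstat -v snapshots and reports how much each "blocked with no ..." counter grew in between. The function names are made up for illustration, and it assumes every output line is a number followed by a label, as in the listings posted above.

```python
import re


def parse_vmstat_v(text):
    """Map each label in `vmstat -v` output to its numeric value."""
    counters = {}
    for line in text.splitlines():
        # Assumed line shape: "<number> <label>", e.g. "2740 filesystem I/Os blocked with no fsbuf"
        match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = float(match.group(1))
    return counters


def blocked_io_deltas(before, after):
    """Return the growth of every 'blocked with no ...' counter between two snapshots."""
    old = parse_vmstat_v(before)
    new = parse_vmstat_v(after)
    return {
        label: int(new[label] - old.get(label, 0))
        for label in new
        if "blocked with no" in label
    }
```

Pass it the text of a snapshot taken before the load starts and one taken afterwards. The totals printed by vmstat -v accumulate since boot, so the counters that climb during the load window are the ones worth a closer look.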

Waiting for your info...

Regards,
Khalid

rehaanali · Aug 8, 2008

Hi Khalid,

1. The QAD MfgPro QA application is running on both servers.

2. I killed some of the processes that were creating the most load, but it made no difference.

3. vmstat 1 10 on first server

System configuration: lcpu=12 mem=10240MB ent=3.00

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
8 3 3659790 36914 0 353 0 0 0 0 168 220210 15550 40 48 8 3 4.74 158.0
5 3 3659715 32609 0 280 0 0 0 0 296 247018 20936 44 47 7 2 5.38 179.5
9 3 3661590 28026 0 168 0 0 0 0 667 230151 18516 43 48 6 4 5.36 178.5
9 3 3662378 24328 0 180 0 0 0 0 842 234256 18750 45 46 5 4 5.43 181.0
5 3 3662300 21929 0 90 0 0 0 0 1036 210637 17360 49 42 5 4 5.31 177.0
5 5 3659477 22152 0 376 0 0 0 0 957 305467 17647 52 39 5 4 5.33 177.7
4 4 3660208 18712 0 381 0 0 0 0 548 214014 19854 50 40 5 5 4.97 165.7
4 3 3659604 17146 0 435 0 0 0 0 1070 202181 17103 45 44 7 4 4.77 159.1
6 3 3656748 17806 0 206 0 0 0 0 450 152512 16254 45 43 8 3 4.53 151.0
4 3 3657026 14723 0 95 0 0 0 0 257 169090 17095 48 42 7 3 4.82 160.6

4. vmstat 1 10 on second server

System Configuration: lcpu=8 mem=7424MB

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
5 0 4104059 46805 0 0 0 0 0 0 320 270724 9666 24 16 60 1
0 0 4104368 46596 0 0 0 0 0 0 340 104409 6210 15 13 72 0
4 0 4104136 46844 0 0 0 0 0 0 366 148509 12405 21 15 63 1
1 0 4104048 46874 0 0 0 0 0 0 330 94364 8266 17 15 68 0
1 0 4103864 47155 0 0 0 0 0 0 357 85737 9921 18 16 65 1
3 0 4103938 46919 0 0 0 0 0 0 357 143575 11544 22 16 61 1
0 1 4102099 48686 0 0 0 0 0 0 356 214282 6989 22 15 61 1
2 0 4103854 47044 0 0 0 0 0 0 369 112778 8054 19 16 66 0
3 0 4104100 46805 0 0 0 0 0 0 369 164337 8053 21 14 65 1
2 0 4106195 44787 0 0 0 0 0 0 321 101238 6461 21 15 64 0

5. On first server:
lquerypv -M hdisk7
256

6. on second server:
lquerypv -M hdisk0
256

lquerypv -M hdisk1
256

Thanks for the Help.

Rehaan.

khalidaaa · Aug 10, 2008

There is no paging on your systems, so it is definitely not a paging problem.
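
One way to check the paging columns over a whole vmstat 1 10 run is to average them rather than eyeball them. This is a minimal Python sketch, not something from the thread: the function name is hypothetical, and it assumes the header line and the sample lines are pasted exactly as vmstat printed them. In AIX vmstat output, pi and po are pages paged in from and out to paging space.

```python
def column_averages(header, rows, columns=("pi", "po")):
    """Average the named vmstat columns across the sample rows."""
    names = header.split()
    totals = {column: 0.0 for column in columns}
    count = 0
    for row in rows:
        fields = row.split()
        # Skip anything that is not a full data row (blank lines, separators).
        if len(fields) != len(names):
            continue
        for column in columns:
            # index() returns the first match, which is fine for pi/po
            # even though "sy" appears twice in the header.
            totals[column] += float(fields[names.index(column)])
        count += 1
    if count == 0:
        return {}
    return {column: totals[column] / count for column in columns}
```

Call it with the "r b avm fre ..." header line and the list of sample lines from one server. Averages near zero for both columns support the no-paging reading; a sustained non-zero pi would be worth a second look.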

The lquerypv output shows that the max LTG is 256. Are you using this in your VG? Could you show the lsvg output for your VGs (lsvg vgname)?

Regards,
Khalid

mrn · Aug 11, 2008

My first thought is filesystem logs. Are you getting any wrap errors in errpt?

Mike

"Whenever I dwell for any length of time on my own shortcomings, they gradually begin to seem mild, harmless, rather engaging little things, not at all like the staring defects in other people's characters."

mrn · Aug 11, 2008

Also find out what is causing the problem.

filemon -o /tmp/filemon.out -P -T 60000000 -O all
sleep 30
trcstop

Take a look at the output in /tmp/filemon.out; you'll see the most active files.

Mike

THOR01 · Aug 11, 2008

I had an issue just last week where we were FTPing large (10+ GB) DB files from one server to another while a different server on the same DS4500 controller was experiencing 100% disk utilization. That was actually being caused by the large FTP between the other two servers. The DS4500 never showed any errors; I just happened to look at the DS4500 performance stats and saw the large FTP job using huge bandwidth.

Just a thought.
