
6E4 server + EXP RAID - performance issue


MoshiachNow (IS-IT--Management)
Hi,
I need some advice here.
It's a 6E4: 4 CPUs, 4 GB memory, a Gigabit interface and an IBM EXP RAID (14x140GB disks).
This machine is a file server for 20 OS X Macs (using afpsrv) and 10 AIX machines (mounting the server over NFS).
When under heavy load, the server WAIT seems to go too high - 40-60%.
From topas observations it looks like hdisk4 and hdisk5 (the RAID volumes) are causing the high system WAIT. At that point these volumes seem to be over 90% busy, but their throughput is quite low (less than 10 MB/sec).
(BTW - their max tested throughput is normally above 50 MB/sec.)
=============================================
Topas Monitor for host: brisque05      Wed Jul  6 14:38:49 2005    Interval: 2

CPU (%):  Kernel 5.8   User 16.2   Wait 64.2   Idle 13.6

EVENTS/QUEUES              FILE/TTY
Cswitch     4034           Readch   808.9K
Syscall    14687           Writech   48626
Reads        849           Rawin         0
Writes       331           Ttyout        0
Forks         10           Igets         1
Execs         19           Namei      1772
Runqueue     1.0           Dirblk      431
Waitqueue    4.0

Network    KBPS   I-Pack   O-Pack   KB-In   KB-Out
en0        52.0      464      447    36.0     68.0
lo0        10.5      160      160    11.0     10.0
et0         0.0        0        0     0.0      0.0
en3         0.0        0        0     0.0      0.0

Disk      Busy%     KBPS   TPS   KB-Read   KB-Writ
hdisk4     99.9   1721.8   306     268.0    3175.6
hdisk5     91.9    431.9   107     755.9     108.0
hdisk1      3.9     32.0     7       0.0      64.0
hdisk3      1.4     10.0     2       0.0      20.0
hdisk0      1.4     10.0     2       0.0      20.0
hdisk2      0.4     16.0     0       0.0      32.0

PAGING                MEMORY                 PAGING SPACE
Faults   5513         Real,MB    4095        Size,MB   5312
Steals      0         % Comp     31.5        % Used     0.6
PgspIn      0         % Noncomp  65.6        % Free    99.3
PgspOut     0         % Client    0.5
PageIn    126
PageOut   429         NFS (calls/sec)
Sios      413         ServerV2 0    ClientV2 0
                      ServerV3 0    ClientV3 0

Name        PID   CPU%   PgSp   Owner
java       32522   6.0   63.3   scitex
afpsrv     22326   1.5   10.3   macuser
QSserver   28918   1.5   11.1   scitex
CTLServer  31784   1.0    3.3   scitex
java       18880   0.7   55.3   scitex
afpsrv     40914   0.5    5.9   root
syncd       7608   0.4    0.6   root
afpsrv     40448   0.4    2.2   macuser
=============================================
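
Rough arithmetic on that snapshot (nothing here beyond dividing the topas figures above): hdisk4 is moving 1721.8 KB/s over 306 transfers per second, i.e. about 5.6 KB per I/O, and hdisk5 431.9 KB/s over 107 TPS, about 4 KB per I/O. So the arrays look saturated servicing a stream of small transfers rather than running out of raw bandwidth.
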
The system parameters are:
vmo -a
memory_frames = 1048576
pinnable_frames = 961752
maxfree = 184
minfree = 120
minperm% = 10
minperm = 97804
maxperm% = 30
maxperm = 293416
strict_maxperm = 0
maxpin% = 80
maxpin = 838861
maxclient% = 30
lrubucket = 131072
defps = 1
nokilluid = 0
numpsblks = 1359872
npskill = 10624
npswarn = 42496
v_pinshm = 0
pta_balance_threshold = 50
pagecoloring = 0
framesets = 2
mempools = 1
lgpg_size = 0
lgpg_regions = 0
num_spec_dataseg = n/a
spec_dataseg_int = n/a
memory_affinity = 1
htabscale = -1
force_relalias_lite = 0
relalias_percentage = 0

ioo -a
minpgahead = 2
maxpgahead = 64
pd_npages = 65536
maxrandwrt = 0
numclust = 1
numfsbufs = 186
sync_release_ilock = 0
lvm_bufcnt = 9
j2_minPageReadAhead = 2
j2_maxPageReadAhead = 8
j2_nBufferPerPagerDevice = 512
j2_nPagesPerWriteBehindCluster = 32
j2_maxRandomWrite = 0
j2_nRandomCluster = 0
jfs_clread_enabled = 0
jfs_use_read_lock = 1
hd_pvs_opn = 6
hd_pbuf_cnt = 640

scraid0 Available 1Z-08 PCI 4-Channel Ultra3 SCSI RAID Adapter

lsattr -El scraid0
bb no BATTERY backed adapter True
bus_intr_lvl 99 Bus interrupt level False
bus_io_addr 0xfc00 Bus I/O address False
data_scrubbing yes Enable Data Scrubbing for RAID Arrays True
intr_priority 3 Interrupt priority False
rebuild_rate med Priority of drive rebuild True
stripe_size 64 Stripe unit size True

# lsattr -El hdisk4
pvid 0040800ac60bfcdd0000000000000000 Physical volume identifier False
queue_depth 8 Queue DEPTH True
raid_level 5 RAID level of the array False
read_ahead yes Read ahead enabled False
size 700067 Capacity of the array in MB False

# lsattr -El hdisk5
pvid 0040800ac627fa7b0000000000000000 Physical volume identifier False
queue_depth 8 Queue DEPTH True
raid_level 5 RAID level of the array False
read_ahead yes Read ahead enabled False
size 840080 Capacity of the array in MB False





Long live king Moshiach !
 
RAID 5 writes, especially if their size isn't an integer multiple of the stripe size, are performance killers.
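
As a rough illustration of why (assuming the adapter takes the usual read-modify-write path for partial-stripe writes, which is a guess about this particular controller rather than anything from its docs): with the 64 KB stripe unit shown on scraid0 above, a write that doesn't cover a complete stripe generally costs four disk operations - read old data, read old parity, write new data, write new parity - so one logical write becomes roughly 4x the physical I/O. A full-stripe write lets the controller compute parity from the new data alone and skip the two reads.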

You can run filemon during one of the periods to determine just what filesystem is the culprit, then pursue more specific cures.
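
For example, something along these lines during one of the busy periods (the output file and the 60-second window are just placeholders, not anything from this box):

# filemon -o /tmp/fmon.out -O lv,pv
# sleep 60
# trcstop

filemon rides on the system trace, so trcstop both ends the measurement and writes the report to the output file; the lv and pv sections break the traffic down by logical and physical volume.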

Also, if you get the Busy% and throughput cleared up, don't worry about Wait I/O%. The processor and memory will always[1] be faster than disk, so if a system is only serving files it'll always have wait time (which is just idle time with outstanding I/O requests).

[1] For certain values of "always". C'mon solid state disks!
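
If you want to watch that split outside of topas, plain vmstat reports the same breakdown; for instance:

# vmstat 2 10

gives ten 2-second samples, where the wa column is the wait-for-I/O percentage and id is true idle.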




Rod Knowlton
IBM Certified Advanced Technical Expert pSeries and AIX 5L
CompTIA Linux+
CompTIA Security+

 
RodKnowlton,

Did you see anything wrong with the above numbers with regard to "RAID 5 writes, especially if their size isn't an integer multiple of the stripe size, are performance killers"?

Also - how can one explain random access from multiple clients dropping the RAID performance from 50 MB/sec (sequential access, single client) to less than 10 MB/sec?

Long live king Moshiach !
 
Hi all,
Still suffering from this RAID performance drop from 50 to 6 MB/sec when 20 Mac clients are READING from it.

filemon (volumes in question are /dataVolumes/brisque05.4.0
and /dataVolumes/brisque05.3.0):

Most Active Logical Volumes
------------------------------------------------------------------------
util    #rblk   #wblk    KB/s   volume        description
------------------------------------------------------------------------
0.96    54008       0  2150.9   /dev/lv02     /dataVolumes/brisque05.4.0
0.87   125736    6528  5267.6   /dev/lv01     /dataVolumes/brisque05.3.0
0.00       32       0     1.3   /dev/hd2      /usr
0.00        0      24     1.0   /dev/hd8      jfslog

Most Active Physical Volumes
------------------------------------------------------------------------
util    #rblk   #wblk    KB/s   volume        description
------------------------------------------------------------------------
0.96    54008       0  2150.9   /dev/hdisk5   N/A
0.84   125736    6528  5267.6   /dev/hdisk4   N/A
0.00       32      24     2.2   /dev/hdisk0   N/A
0.00        0      24     1.0   /dev/hdisk3   N/A



Detailed Logical Volume Stats (512 byte blocks)
------------------------------------------------------------------------

VOLUME: /dev/lv02 description: /dataVolumes/brisque05.4.0
reads: 6731 (0 errs)
read sizes (blks): avg 8.0 min 8 max 32 sdev 0.6
read times (msec): avg 5.050 min 0.106 max 374.009 sdev 8.496
read sequences: 6356
read seq. lengths: avg 8.5 min 8 max 32 sdev 2.1
seeks: 6356 (94.4%)
seek dist (blks): init 943940640,
avg 414583997.5 min 8 max 1320313528 sdev 367418110.7
time to next req(msec): avg 1.865 min 0.000 max 366.213 sdev 4.960
throughput: 2150.9 KB/sec
utilization: 0.96

VOLUME: /dev/lv01 description: /dataVolumes/brisque05.3.0
reads: 5603 (0 errs)
read sizes (blks): avg 22.4 min 8 max 256 sdev 42.2
read times (msec): avg 5.766 min 0.106 max 69.401 sdev 6.996
read sequences: 4185
read seq. lengths: avg 30.0 min 8 max 1024 sdev 72.9
writes: 204 (0 errs)
write sizes (blks): avg 32.0 min 32 max 32 sdev 0.0
write times (msec): avg 0.309 min 0.258 max 0.542 sdev 0.044
write sequences: 204
write seq. lengths: avg 32.0 min 32 max 32 sdev 0.0
seeks: 4389 (75.6%)
seek dist (blks): init 901064512,
avg 301557663.3 min 8 max 1224962272 sdev 244266169.0
time to next req(msec): avg 2.161 min 0.000 max 366.324 sdev 5.595
throughput: 5267.6 KB/sec
utilization: 0.87
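
Rough arithmetic on those numbers (only what the trace itself reports): on lv02 the reads average 8 blocks, i.e. 4 KB, 94.4% of them require a seek, and the average read takes about 5 ms to service. At 4 KB per ~5 ms request that is only a few hundred KB/sec per outstanding request, so the measured ~2.1 MB/sec is roughly what a handful of concurrent small scattered reads can pull from a seek-bound array - the drop from 50 MB/sec sequential looks like seek time, not a shortage of disk bandwidth.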


Long live king Moshiach !
 