I have an AIX 5.1 server (32-bit kernel) that runs Oracle 8 databases on JFS2 filesystems. I've recently observed that the system hangs while creating large datafiles (2 GB and more).
In my tests, the dd command does the same: the shell prompt comes back very quickly (just as the create datafile statement does), but the system then freezes until all the blocks have really been written to disk (I suppose). CPU activity climbs to "95% wait i/o", disk activity is at 100% (~9500K written per second) and new processes are queued, even if they do not need this disk.
On the other hand, the cp command on the same big file works fine (it does not return the shell prompt very quickly, but it does not disturb other processes).
This JFS2 filesystem is on hdisk2, which is a 128 GB LUN. We use an FC link to a StorageTek array. On this array, the LUN is defined as 16 disks of 8 GB each, model 3390 (these are virtual; in fact this RAID 5 array is physically made of 18 GB hard disks).
Of course, IBM, StorageTek and Oracle support keep blaming one another, and nothing comes of it.
My questions are:
1. I've run tests and noticed that the hangs appear on whatever physical storage I use, as long as it's a JFS2 filesystem (even on a lightly loaded system). Has anyone already seen this, and what did you do about it? I have just ordered ML 3 and I'd like advice from people who may have encountered this problem.
2. Which values should I use to best configure the AIO servers? Should I compute total AIO servers = (number of AIO servers per disk) x (number of disks), where the number of disks is:
-a- the number of LUNs (=1), which is quite small
-b- the number of physical disks in use in the RAID array (=7), which is what I did
-c- the number of model 3390 disks defined in the RAID array (=16), which is quite high
What about adjusting the priority of the AIO processes?
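To make the three options in question 2 concrete, here is a minimal sketch of the arithmetic. The figure of 10 AIO servers per disk is only an assumed placeholder for illustration, not a measured or recommended value, and the helper name is hypothetical.

```python
# Sketch of: total AIO servers = (AIO servers per disk) x (number of disks).
# SERVERS_PER_DISK is an assumed placeholder value, not a recommendation.
SERVERS_PER_DISK = 10


def total_aio_servers(servers_per_disk, disk_count):
    """Return the total number of AIO servers for a given disk count."""
    return servers_per_disk * disk_count


# The three ways of counting "disks" listed in the question.
disk_counts = {
    "a": 1,   # number of LUNs
    "b": 7,   # physical disks in use in the RAID array
    "c": 16,  # model 3390 virtual disks defined in the array
}

totals = {
    option: total_aio_servers(SERVERS_PER_DISK, count)
    for option, count in disk_counts.items()
}

for option, total in totals.items():
    print(f"option {option}: {disk_counts[option]} disk(s) -> {total} AIO servers")
```

With this placeholder, the three options differ by a factor of 16 between the smallest and the largest, which is why the choice of disk count matters.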
Any ideas are welcome. Thanks in advance.