
aix experts advice required

Status
Not open for further replies.

aixnag

MIS
Oct 30, 2001
99
SG

Setup details: two AIX boxes running 4.3.3-10, located at two different sites.

As per the client's requirement, we trigger a cron job every 10 minutes to check the following:
1. Node status, using #ping -c 10 <hostname>
2. Utilization of two filesystems (for example fs1 and fs2), using #rsh combined with df.
If filesystem 'fs1' is within the threshold (90%), files are transferred from the primary site to the remote site.
If filesystem 'fs1' has reached the threshold, files are transferred to the 'fs2' filesystem instead.

Each file is between 0 and 3 MB. The two locations are connected by a 2MB leased line.
We use 100 Mbps Ethernet adapters at both locations. I am sending three files concurrently to the remote site.

Problem: it works well most of the time. But once every three or four days, the files are transferred to the 'fs2' filesystem even though 'fs1' usage is only 20%.

We concentrated on network performance/throughput: we increased tx_que_size, tcp_sendspace, and tcp_recvspace to 128 KB. No packets are dropped at the interface level.

Does anybody know why we are facing this problem?

Are there any limits on the number of rsh (remote) sessions, or in terms of sockets or threads?

Expert comments/advice requested.

Thanks in advance

aixnag
 
Hi,

Are you sure the application (if there is one) doesn't write to fs1 temporarily, pushing usage above 90% just as your script runs and starts the copy? By the time you look at it, it is back to only 20% used.

Why don't you set a warning threshold, so that if fs1 reaches, say, 88%, a warning message is sent out (via mail to root, etc.)? Then you can check it before you actually do the copy.

What command are you using, and what are you checking for in your scenario?

Just a few thoughts.
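
The warning-threshold idea can be sketched as a small classifier. This is only an illustration: the function name classify_usage and the variables WARN_PCT and LIMIT_PCT are hypothetical and not part of the original script; the 88% and 90% values are the ones mentioned in this thread.

```shell
# Hypothetical helper, not from the original script.
# WARN_PCT and LIMIT_PCT are assumed names; 88 and 90 come from the thread.
WARN_PCT=88
LIMIT_PCT=90

# Print "ok", "warn" or "full" for a percent-used value.
# Print "invalid" and return 1 if the value is empty or not a plain integer.
classify_usage() {
    pct=$1
    case $pct in
        ''|*[!0-9]*) echo invalid; return 1 ;;
    esac
    if [ "$pct" -ge "$LIMIT_PCT" ]; then
        echo full
    elif [ "$pct" -ge "$WARN_PCT" ]; then
        echo warn
    else
        echo ok
    fi
}
```

A caller could send mail on "warn" and treat "invalid" as a reason to retry the check, so that a bad reading is not silently handled the same way as a full filesystem.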
 
Hi,

If you are checking one of the fields from "df" output, it's possible that an occasional communication problem returns a wrong value, or even garbage, which causes your comparison to fail and the files to be written to fs2.

I would log the output of the "rsh df" to a log ONLY when the copy goes to fs2; then one day you'll find out what kind of value you got at that point.

"Long live king Moshiach !"
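
A minimal sketch of that logging step, assuming the raw "rsh df" output has already been captured in a variable before it is parsed. The function name log_df_result and the log path in the usage comment are hypothetical.

```shell
# Hypothetical helper: append a timestamped record of the raw df output.
# Arguments: log file, chosen target filesystem, raw output text.
log_df_result() {
    logfile=$1
    target=$2
    raw=$3
    {
        printf '=== %s target=%s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$target"
        if [ -n "$raw" ]; then
            printf '%s\n' "$raw"
        else
            # An empty capture is itself a useful clue, so record it explicitly.
            printf '(no output received)\n'
        fi
    } >> "$logfile"
}

# Possible use, mirroring the rsh call style of the posted script:
#   raw=$($REMSH $remoteSystem -n -l $remoteUser "$DF $dir" 2>&1)
#   ... decide the target ...
#   [ "$target" = fs2 ] && log_df_result /tmp/fs_fallback.log fs2 "$raw"
```

Capturing stderr together with stdout (2>&1) means an rsh error message ends up in the log instead of being lost.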
 
Hi levw & DSMARWAY,

I believe you are on the right track. We suspect the same thing. But how do we check for and verify an occasional communication problem? Can you expand on this?

Is there any test method to detect these communication problems?

Related script details:

There are two functions:
1. checking the FS KB free
2. checking the FS % used

They should give a clearer picture.

******************************
function stbyfs_FileSysKBytesFree
{
    (( $#==1 || $#==3 )) || return $FAIL

    typeset -i kbFree
    typeset dir=$1 remoteSystem=$2 remoteUser=$3

    [[ -z "$dir" ]] && return $FAIL

    # Set kbFree based on df -k output
    if (( $# == 1 )); then
        eval $($DF $dir | $AWK 'NR==2{printf "kbFree=%d\n",$4}')
    else
        stbyfs_RemoteSystemAccessible $remoteSystem $remoteUser $TRUE $FALSE || return $FAIL
        eval $($REMSH $remoteSystem -n -l $remoteUser "$DF $dir" | $AWK 'NR==2{printf "kbFree=%d\n",$4}')
    fi

    # If eval fails, return FAIL
    # NOTE: There is no checking for failure of remote df command
    (( $? != $SUCCESS )) && return $FAIL

    # If kbFree is >=0, output the value and return SUCCESS
    (( kbFree >= 0 )) && { print -- "$kbFree"; return $SUCCESS; }

    return $FAIL
}
************************************************
function stbyfs_FileSysPctUsed
{
    (( $#==1 || $#==3 )) || return $FAIL

    typeset -i pctFree
    typeset dir=$1 remoteSystem=$2 remoteUser=$3

    [[ -z "$dir" ]] && return $FAIL

    # Set pctFree based on df -k output
    if (( $# == 1 )); then
        eval $($DF $dir | $AWK 'NR==2{printf "pctFree=%d\n",substr($5,1,length($5)-1)}')
    else
        stbyfs_RemoteSystemAccessible $remoteSystem $remoteUser $TRUE $FALSE || return $FAIL
        eval $($REMSH $remoteSystem -n -l $remoteUser "$DF $dir" | $AWK 'NR==2{printf "pctFree=%d\n",substr($5,1,length($5)-1)}')
    fi

    # If eval fails, return FAIL
    # NOTE: There is no checking for failure of remote df command
    (( $? != $SUCCESS )) && return $FAIL

    # If pctFree is >=0 and <=100, output the value and return SUCCESS
    (( pctFree>=0 && pctFree<=100 )) && { print -- "$pctFree"; return $SUCCESS; }

    return $FAIL
}

*****************************

Thanks in advance

aixnag
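
The NOTE in both functions says a failed remote df is not checked. One possible way to close that gap is to validate the field before using it, so that empty or garbled output is reported as a failure instead of being turned into a number. The sketch below is an assumption-laden illustration: parse_pct_used is a hypothetical name, and it assumes $DF produces output where field 5 of line 2 is the percent used with a trailing %, as the posted awk expression implies.

```shell
# Hypothetical helper: read df output on stdin and print the percent used.
# Assumes line 2, field 5 has the form NN% (as the posted script expects).
# Returns 1 and prints nothing if that line is missing or the field is
# not a percentage between 0% and 100%.
parse_pct_used() {
    awk 'NR == 2 {
             if ($5 ~ /^[0-9]+%$/ && $5 + 0 <= 100) {
                 sub(/%$/, "", $5)
                 print $5
                 ok = 1
             }
         }
         END { exit ok ? 0 : 1 }'
}

# Possible use with the variables from the posted script:
#   pct=$($REMSH $remoteSystem -n -l $remoteUser "$DF $dir" | parse_pct_used) \
#       || return $FAIL
```

With this shape the caller can tell "the reading failed" apart from "fs1 is over the threshold" and, for example, retry the check rather than switch to fs2.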

 
HI,

Once you are logging the returned value AFTER writing to fs2, you can also log the output of the following to the same log:

1. errpt|head -10
2. entstat -drt ent0 |grep -i error

"Long live king Moshiach !"
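
Collecting those diagnostics in the same log could look roughly like this. The function name log_diagnostics and the log path are hypothetical; the two commands in the usage comment are the ones listed in the post and exist only on AIX.

```shell
# Hypothetical helper: run each given command line through sh -c and append
# its output (stdout and stderr) to the log under a timestamped header.
log_diagnostics() {
    logfile=$1
    shift
    for cmd in "$@"; do
        {
            printf '=== %s: %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd"
            # Record a non-zero exit status instead of aborting the loop.
            sh -c "$cmd" 2>&1 || printf '(exit status %s)\n' "$?"
        } >> "$logfile"
    done
}

# Possible use on AIX, right after a copy has gone to fs2:
#   log_diagnostics /tmp/fs_fallback.log \
#       'errpt | head -10' \
#       'entstat -drt ent0 | grep -i error'
```

Note that grep exits non-zero when nothing matches, so an "(exit status 1)" line after the entstat command may simply mean no error lines were found.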
 
Hi levw,

Can you look at the script? Can you make anything out of it?

aixnag
 