Tracking errpt report

Status
Not open for further replies.

samirm

Technical User
May 12, 2003
80
US
Hi Guru..

Presently I don't have any alerting mechanism for issues on our AIX servers. We use "nmon" to analyze the performance of the servers, but we want something that sends notifications.

At present I am writing shell scripts, run from crontab, that monitor the file systems and send an alert when usage crosses a threshold value.

Do you know of any tool available for AIX that does this? For example, if a new entry shows up in errpt, it should notify the group so they can take action.

Thanks ..

Sam
 
We use Nagios to monitor and alert on several items, errpt among them. This script (not mine) can also be used standalone.

Code:
#!/bin/ksh
# script that checks for new errpt
#
errptlast=/tmp/errptlast
errpttmp=/tmp/errpttmp
admin=root@localhost
#
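# snapshot of the current summary (kept from the original; not used further below)
errpt | egrep -v '^IDENTIFIER' > $errpttmp
# check if first run
if [ ! -f $errptlast ] ; then
        # first run: fake a last-run timestamp 2 hours back (clamped at midnight)
        hour=`date +%H`
        phour=$(( ${hour#0} - 2 ))
        [ "$phour" -lt 0 ] && phour=0
        # errpt -s expects mmddhhmmyy, so the hour must keep two digits
        [ "$phour" -lt 10 ] && phour=0$phour
        date +%m%d$phour%M%y > $errptlast
fi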
lastts=`cat $errptlast`
count=`errpt -s "$lastts" | egrep -v '^IDENTIFIER' | wc -l`
exit1=$?
count=`echo $count`
countcrit=`errpt -s "$lastts" | egrep -v '^IDENTIFIER' | sed -n 's!.* \(P\) .*!\1!p' | wc -l`
exit2=$?
countcrit=`echo $countcrit`
if [ "$count" -ge 1 ]  ; then
                # determine level
                errpt -A -D -s "$lastts" | mail -s "$count new error reports of which $countcrit critical on `hostname` since $lastts" $admin
                date +%m%d%H%M%y > $errptlast
                if [ "$countcrit" -ge 1 ] ; then
                        echo "$count new error reports of which $countcrit critical generated since $lastts"
                        exit 2
                else
                        echo "$count New error reports generated since $lastts"
                        exit 1
                fi
fi
if [ "$count" -eq 0 ] && [ "$countcrit" -eq 0 ] && [ "$exit1" -eq 0 ] && [ "$exit2" -eq 0 ] ; then
        echo "No new Error Reports since $lastts"
        exit 0
fi
echo "Errpt Check failed"
exit 3

Be sure to change the admin=root@localhost to a more appropriate email address. Set this up to run in cron, say every 15 minutes or whatever you feel is appropriate. If it sees any new errpt messages from the last time it ran, it will email those errors to whomever is in the admin= list.
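
For the cron part, here is a minimal sketch of what the entry could look like. The script path /usr/local/bin/check_errpt.ksh and the cron_entry helper are my own placeholders, not something from the original script, so adjust them to wherever you saved it. The snippet only prints the line; it does not touch your crontab.

```shell
#!/bin/sh
# Print a crontab line that runs the given script every 15 minutes.
# The script path passed in below is a placeholder (assumption).
cron_entry() {
        # Minute list instead of the */15 shorthand, which not every
        # cron implementation accepts.
        printf '%s\n' "0,15,30,45 * * * * $1 >/dev/null 2>&1"
}

cron_entry /usr/local/bin/check_errpt.ksh
```

Paste the printed line in with crontab -e as the user that should run the check. The output is sent to /dev/null because the script already mails its own report, and cron would otherwise mail the one-line status message to the crontab owner on every run.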
 