
checking errpt

Status
Not open for further replies.

icu812

MIS
Sep 10, 2001
52
US
Does anyone have a script that checks for errors in errpt (on an AIX box) on a daily basis? I have approx. 32 AIX boxes and have been tasked with monitoring errpt on a weekly basis. Help! Thanks
 
E-mail me at WVerzal@komint.com and I'll send you my error log monitoring script.

Bill.
 
Here is a script that might work for you:

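#!/bin/ksh
# Mail any errpt entries logged since the previous run.
COUNTFILE=/usr/local/bin/errpt.count

# Count the current entries, skipping the header line.
TOTALERRS=$(errpt | grep -v "IDENTIFIER" | wc -l)
# On the first run there is no count file yet, so fall back to 0.
OLDERRS=$(cat ${COUNTFILE} 2>/dev/null)
((NEWERRS=TOTALERRS-${OLDERRS:-0}))

if [ ${NEWERRS} -gt 0 ]
then
# errpt lists the newest entries first, so the new ones are at the top.
errpt | grep -v "IDENTIFIER" | head -${NEWERRS} | cut -c 42- |
while read ERRMSG
do
echo "errpt:${ERRMSG}" | /usr/bin/mailx user@domain.com
done
fi

echo ${TOTALERRS} > ${COUNTFILE}
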
Regards,
Chuck

chuck.spilman@nokia.com
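
One case the count-difference approach above does not cover is the log being cleared (for example with errclear) between runs, which makes the difference negative. Below is a minimal sketch of the same idea with that case handled. It reads errpt-style output from a file so it can be tried on any box; the function name new_entries and its arguments are my own assumptions, not part of Chuck's script.

```shell
#!/bin/sh
# Sketch only: new_entries is a hypothetical helper, not an AIX command.
#   $1 = file holding errpt-style output (header line, newest entry first)
#   $2 = number of entries recorded by the previous run
new_entries() {
    # Count entries, skipping the header; $(( )) strips wc's padding.
    total=$(( $(grep -v "IDENTIFIER" "$1" | wc -l) ))
    new=$(( total - $2 ))
    # A negative difference means the log was cleared since the last run,
    # so treat every remaining entry as new.
    if [ "$new" -lt 0 ]; then
        new=$total
    fi
    if [ "$new" -gt 0 ]; then
        grep -v "IDENTIFIER" "$1" | head -n "$new"
    fi
    return 0
}
```

On a real system you would feed it a snapshot, e.g. errpt > /tmp/errpt.snap followed by new_entries /tmp/errpt.snap "$OLDERRS" (the snapshot path is an assumption too).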
 
icu812:

I haven't tried Chuck's script, but I do use Bill's script. I start it up at boot time and have it set in the script to run every five minutes. I started using it just as we were starting to have massive disk problems (caused by an overheated computer room) and it saved me and my users.

 
Here is a simple script that can be cron'd to run each day at 23:59. It mails the current day's error log entries to the admin.

#!/bin/ksh
##########################################################################
#
# ckerrpt: This script will look for all errors in the errpt
# for the current day. It should be run at 23:59 to catch all
# errors that occurred that day.
#
##########################################################################
#set -x
tempfile=/tmp/cherrpt.log

#-------------------------------------------------------------------------
# Create header for mail
#-------------------------------------------------------------------------
print "host: "$(hostname) "has experienced the following errors ..." > $tempfile
print >> $tempfile

#-------------------------------------------------------------------------
# Create a date string that will match errpt date format MMDDHHMMYY
#-------------------------------------------------------------------------
today=$(date +%m%d)'[0-9][0-9][0-9][0-9]'$(date +%y)

#-------------------------------------------------------------------------
# Place each matching errpt line into the temp file
# Ignore tty errors because there may be too many of them and they are benign
# for the most part.
#-------------------------------------------------------------------------
errpt | egrep "$today" | egrep -v "tty" >> $tempfile

#-------------------------------------------------------------------------
# If there are more than two lines in the temp file then assume there were
# matches in the errpt and send mail.
# ADMIN_MAIL must be set in the environment; cron does not read your
# login profile, so export it in the crontab entry or at the top of this script.
#-------------------------------------------------------------------------
set $(wc -l $tempfile)
if [[ $1 -gt 2 ]]; then
mail -s "$(hostname): daily error report" $ADMIN_MAIL < $tempfile
fi
fi

rm $tempfile
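
The date matching in the script above relies on the errpt timestamp being MMDDhhmmYY, so the pattern is today's MMDD, any four digits for the time, then the two-digit year. Here is a small sketch of just that step that can be tried without errpt. The helper name errpt_day_pattern and the sample line are made up for illustration.

```shell
#!/bin/sh
# Sketch only: errpt_day_pattern is a hypothetical helper.
#   $1 = MMDD (defaults to today), $2 = YY (defaults to this year)
errpt_day_pattern() {
    mmdd=${1:-$(date +%m%d)}
    yy=${2:-$(date +%y)}
    # MMDD, four digits for hhmm, then YY
    printf '%s[0-9][0-9][0-9][0-9]%s\n' "$mmdd" "$yy"
}

# Made-up line in errpt summary layout, stamped Sep 10 14:30 2001.
sample='A1B2C3D4   0910143001 P H hdisk0         DISK OPERATION ERROR'
echo "$sample" | grep "$(errpt_day_pattern 0910 01)"

# Possible crontab entry for the daily run (the install path is an assumption):
# 59 23 * * * /usr/local/bin/ckerrpt
```

Quoting the pattern matters: left unquoted, the shell may try to expand the bracket expressions as a filename glob before grep ever sees them.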
 
The following script produces statistics on "Application terminated abnormally" (core dump) errors in errpt.
Cut and paste the following into a terminal window:

#============================================
ksh

function core_source {
CORE_PATH="/tmp/core_path"
CORE_SOURCE_LOG="/tmp/core_source.log"
CORE_STATISTICS="/tmp/core_stat"
CORE_SORT="/tmp/core_sort"

#create statistics for all application cores for this host and add it to the main log:
echo "-----------"
echo "Application cores statistics:"
errpt -a -j C60BB505 |awk '/PROGRAM NAME/,/ADDITIONAL INFORMATION/' | grep -vE "PROG|ADD" | sort > $CORE_STATISTICS
cat $CORE_STATISTICS | sort -u > $CORE_SORT
for CORE_REASON in `cat $CORE_SORT |cut -d" " -f1`
do
CORE_COUNT=`grep $CORE_REASON $CORE_STATISTICS |wc -l`
printf "%-20s %6i\n" $CORE_REASON $CORE_COUNT
done
#Now collect the sysdump and SYSINTR info:
echo "-----------"
}
core_source
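
The per-program counting in the function above can also be done in one pass with sort and uniq -c, which avoids the temp files and the grep loop (grep on a bare name also counts longer names that contain it). This is an alternative sketch, not the poster's method. count_names is a hypothetical helper that expects one program name per line on stdin.

```shell
#!/bin/sh
# Sketch only: count_names is a hypothetical helper.
# Reads one program name per line and prints "name  count" per distinct name.
count_names() {
    sort | uniq -c | awk '{ printf "%-20s %6d\n", $2, $1 }'
}

# Stand-in input; on AIX the names would come from the errpt/awk/grep
# pipeline used in the function above.
printf '%s\n' ksh java ksh | count_names
```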

 