Does anyone have a script that checks for errors in errpt (on an AIX box) on a daily basis? I have approx. 32 AIX boxes and have been tasked with monitoring errpt on a weekly basis. Help! Thanks
I haven't tried Chuck's script, but I do use Bill's script. I start it up at boot time and have set it in the script to run every five minutes. I started using it just as we were starting to have massive disk problems (caused by an overheated computer room) and it saved me and my users.
Here is a simple script that can be cron'd to run each day at 23:59. It mails the current day's error log entries to the admin.
#!/bin/ksh
##########################################################################
#
# ckerrpt: This script will look for all errors in the errpt
# for the current day. It should be run at 23:59 to catch all
# errors that occurred that day.
#
##########################################################################
#set -x
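tempfile=/tmp/cherrpt.log
# ADMIN_MAIL is not set in this script; it must hold the recipient address
# (set it here or export it in the environment cron runs the script with).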
#-------------------------------------------------------------------------
# Create header for mail
#-------------------------------------------------------------------------
print "host: "$(hostname) "has experienced the following errors ..." > $tempfile
print >> $tempfile
#-------------------------------------------------------------------------
# Create a date string that will match errpt date format MMDDHHMMYY
#-------------------------------------------------------------------------
today=$(date +%m%d)'[0-9][0-9][0-9][0-9]'$(date +%y)
#-------------------------------------------------------------------------
# Place each matching errpt line into the temp file
# Ignore tty errors because there may be too many of them and they are benign
# for the most part.
#-------------------------------------------------------------------------
errpt | egrep "$today" | egrep -v "tty" >> $tempfile
#-------------------------------------------------------------------------
# If there are more than two lines in the temp file then assume there were
# matches in the errpt and send mail.
#-------------------------------------------------------------------------
set $(wc -l $tempfile)
if [[ $1 != "2" ]]; then
mail -s "$(hostname): daily error report" "$ADMIN_MAIL" < $tempfile
fi
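The date match used in the script above can be tried without an AIX box. This is a minimal sketch, assuming the usual errpt summary layout (IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION) with the timestamp in MMDDhhmmYY form; the function name errpt_for_day and the sample lines are made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# errpt_for_day MMDD YY: read errpt summary lines on stdin and print those
# whose timestamp (MMDDhhmmYY) falls on the given day, leaving out tty entries.
# The function name and the sample lines below are hypothetical.
errpt_for_day() {
    # Spaces on both sides anchor the match to the TIMESTAMP column.
    pattern=" $1[0-9][0-9][0-9][0-9]$2 "
    grep -E "$pattern" | grep -v "tty"
}

# Made-up sample standing in for real errpt output.
sample='IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
C60BB505   0705101424 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
0873CF9F   0705093024 T H tty0           TTYHOG OVER-RUN
C60BB505   0704221524 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED'

result=$(printf '%s\n' "$sample" | errpt_for_day 0705 24)
printf '%s\n' "$result"
```

On a live box the same function would be fed with something like errpt | errpt_for_day "$(date +%m%d)" "$(date +%y)". errpt also accepts -s with a start date in the same mmddhhmmyy format, which may be a simpler way to limit the report to today. A crontab entry for the 23:59 run would look like "59 23 * * * /usr/local/bin/ckerrpt", where the path is only an assumption about where the script is installed.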
The following is a script that gives statistics on "Application terminated Abnormally" (core) errors in errpt.
Cut and paste the following into the terminal window:
#============================================
ksh
function core_source {
CORE_PATH="/tmp/core_path"
CORE_SOURCE_LOG="/tmp/core_source.log"
CORE_STATISTICS="/tmp/core_stat"
CORE_SORT="/tmp/core_sort"
#create statistics for all application cores for this host and add it to the main log:
echo "-----------"
echo "Application cores statistics:"
errpt -a -j C60BB505 |awk '/PROGRAM NAME/,/ADDITIONAL INFORMATION/' | grep -vE "PROG|ADD" | sort > $CORE_STATISTICS
sort -u $CORE_STATISTICS > $CORE_SORT
for CORE_REASON in $(cut -d" " -f1 $CORE_SORT)
do
# Count the lines in the statistics file that mention this program
CORE_COUNT=$(grep -c "$CORE_REASON" $CORE_STATISTICS)
printf "%-20s %6d\n" "$CORE_REASON" "$CORE_COUNT"
done
#Now collect the sysdump and SYSINTR info:
echo "-----------"
}
core_source
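The per-program count in the loop above can also be done in one pass with sort and uniq -c. This is a rough sketch rather than a drop-in replacement: the helper name count_names is hypothetical, and the sample program names stand in for what the errpt -a pipeline would produce.

```shell
#!/bin/sh
# count_names: read program names on stdin (one per line, first field only,
# blank lines skipped) and print each distinct name with its count.
# The helper name is hypothetical.
count_names() {
    awk 'NF { print $1 }' | sort | uniq -c |
        awk '{ printf "%-20s %6d\n", $2, $1 }'
}

# Made-up sample standing in for the program names extracted from errpt -a.
result=$(printf '%s\n' httpd java httpd sendmail httpd | count_names)
printf '%s\n' "$result"
```

Because uniq -c compares whole lines, a short name such as sh is not also counted inside ksh, which can happen when counting with a plain grep on the name.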