Compare & Repair Automatic Shell scripts error out


tekpr00 (IS-IT--Management)
Hello All,

I stumbled on some scripts online to help automate compare & repair.
This compare & repair process normally happens in the control panel of a third-party application, SharePlex.
The shell scripts are supposed to be used for RAC nodes.
However, we are using them to compare & repair on the same server; the server names are masked so that the source and target database servers appear to have different names (seq1 & seq1-ds).
My directories are mapped as follows:
seq1:oracle::~->/orahome/app/oracle/home
drwxrwxr-x 2 oracle dba 96 Oct 3 13:19 log
-rwxrwxr-x 1 oracle dba 2894 Oct 3 13:42 dbbatch.sh
drwxrwxr-x 2 oracle dba 96 Oct 3 13:56 splex
-rwxrwxr-x 1 oracle dba 395 Oct 4 10:13 functions.sh
-rwxrwxr-x 1 oracle dba 8475 Oct 3 16:14 spxrepair.sh
my source database is:
sfgi:/orahome/app/oracle/product/10.2.05/dbq413:N
my target database is:
etxi:/orahome/app/oracle/product/11.2.04/db_1:N
Here is the first script; please note that I have included comments and errors on the applicable lines.
spxrepair.sh
Code:
# ************************************************************************************** #

# Set the Oracle environment:
. $HOME/dbbatch.sh
. $DBA/functions.sh
# Set the Shareplex environment:
. $DBA/splex/spxenv.sh
. $DBA/splex/spxfunctions.sh

#****************************************************************
# Functions
#****************************************************************
error()
{
  echo "$@" 1>&2
  if [ "$fmail" ]
    then mailto $fmail "$@" "Fatal error from $PROGRAM on $machine in $BBENV"
  fi
  exit 1
}

usage()
{
  version
  echo "usage: $PROGRAM -s source_SPX_sid -t target_SPX_sid "
  echo "                -l log_directory -n source_nodelist -a header_file "
  echo "                [-e notification_email] [-f failure_email] "
  echo "                [--clean] "
  echo " "
  echo "Arguments:"
  echo "   -s source_SPX_sid     Shareplex SID of the source database "
  echo "   -t target_SPX_sid     Shareplex SID of the target database "
  echo "   -l directory          Shared SPX directory to store repair files "
  echo "   -n source_nodelist    Comma-delimited list of source nodes "
  echo "   -a header_file        Full path and file name for file containing script "
  echo "                         headers, for creating the repair script "
  echo " "
  echo "Optional arguments: "
  echo "   -e notification_email Comma-delimited list of emails for notification of a repair \(target\) "
  echo "   -f failure_email      Comma-delimited list of emails to notify if errors "
  echo "   --clean               Remove files on successful repair "
  echo " "

}

usage_and_exit()
{
  usage
  exit $1
}

version()
{
  echo " "
  echo "$PROGRAM version $VERSION"
  echo " "
}

# Set up the environment. Define default values here.
PROGRAM=`basename $0`
VERSION=1.2
basename=spxrepair
clean=
email=
fmail=
header_file=
machine=$BBHOST
nodelist=
repair_dir=
sid_source=
sid_target=
headers_written=

# Get command line arguments:
while getoptex "a: clean; e: f: h; help; l: n: s: t: v; ?;" "$@"
do
  case $OPTOPT in
    s                ) sid_source="$OPTARG"  ;;
    t                ) sid_target="$OPTARG"  ;;
    l                ) repair_dir="$OPTARG"  ;;
    n                ) nodelist="$OPTARG"    ;;
    a                ) header_file="$OPTARG" ;;
    e                ) email="$OPTARG"       ;;
    f                ) fmail="$OPTARG"       ;;
    clean            ) clean=1               ;;
    h                ) usage_and_exit 0      ;;
    help             ) usage_and_exit 0      ;;
    '?'              ) usage_and_exit 0      ;;
    v                ) version               ;;
  esac
done
shift $((OPTIND-1))

# Validate command line arguments.

# Set up the environment; warn if an invalid SID is supplied
if [ ! "$sid_source" ]
  then error "A source SID to repair is required"
elif [ ! "$sid_target" ]
  then error "A target SID for show sync is required"
fi

# A local repair directory is required:
if [ ! "$repair_dir" ]
  then error "A local directory for repair files must be supplied"
elif [ ! -d "$repair_dir" ]
  then error "The supplied repair directory does not exist"
elif [ ! -w "$repair_dir" ]
  then error "The supplied repair directory is not writable"
fi

# At least one source node is required.
if [ ! "$nodelist" ]
  then error "At least one source node must be specified"
fi

# A header file is required. It should call whatever environment configs are necessary
# for running Shareplex on the source machine.
if [ ! "$header_file" ]
  then error "A header file must be specified"
elif [ ! -e "$header_file" ]
  then error "The specified header file $header_file does not exist"
elif [ ! -r "$header_file" ]
  then error "The specified header file $header_file is not readable"
fi

# Create the file extensions and files to be used:
now=`date '+%y-%m-%d_%H-%M-%S'`
repairfile=$repair_dir/$basename.$now.sh
logfile=$repair_dir/$basename.$now.log
resultfile=$repair_dir/$basename.$now.out

# Flag to determine if we've already written headers. This saves the need for creating 
# (and subsequently deleting) the repair and log files if there's nothing to repair:
headers_written=

# Loop through the list of nodes the user has supplied and see where sp_cop is running:
set -f; IFS=,
for node in $nodelist
do
   nodecheck=`ssh $node ps -ef | grep sp_cop | grep -v grep | wc -l`
   if [ "$nodecheck" == 1 ]
     then node_source=$node
          # If sp_cop is discovered running on a node, break the loop:
          break;
   fi
done
set +f; unset IFS

# Make sure there was a node returned:
if [ ! "$node_source" ]
  then error "Could not obtain the SPX source node from $machine in $BBENV"
fi

# On the target, loop through the results of the "show sync" command to get all tables
# that may be out of sync and build the repair and log files. Pull just the fields we need
# to discover the table and generate a notification email:
sp_ctrl show sync on $sid_target 2>&1 | grep 'out of sync' | awk '{print $1,$3,$9,$11,$12}' | while read line
do
  # Is this the first time through? If so, write headers.
  if [ ! "$headers_written" ]
    then cat $header_file > $repairfile
         if [ $? -ne 0 ]
           then error "Could not create the repair script file"
         fi
         # Add headers to the notification file:
         echo "The following tables were out of sync on $machine in $BBENV. A repair has been started" > $logfile
         echo "on host $node_source for database $sid_source." >> $logfile
         echo "" >> $logfile
         if [ $? -ne 0 ]
           then error "Could not create the email notification file"
         fi
         # Set the header flag so that we don't write them again:
         headers_written=1
  fi
  # Add the table to be repaired to the repair file:
  echo "$line" | awk '{print "repair " $2}' >> $repairfile
  # Create a verbose entry for the table:
  echo "$line" | awk '{print $1 " rows out of sync in " $2 " for queue " $3 " as of " $4 " " $5}' >> $logfile
done

# If a repair file was created, there are tables to be repaired. Run the repair script on the source via ssh:
if [ -e "$repairfile" ]
  then echo "EOF" >> $repairfile
       chmod 750 $repairfile
       ssh $node_source $repairfile > $resultfile
       if [ $? -ne 0 ]
         then error "Repair failed on host $node_source for SID $sid_source for script $repairfile"
       fi
       if [ "$email" ]
         then echo "" >> $logfile
              cat $resultfile | grep -v '^$' | grep -v '^*' >> $logfile
              mailto $email $logfile "Shareplex repair started from $machine in $BBENV" FILE
       fi
       # Delete the repair file.
       rm $repairfile
       if [ $? -ne 0 ]
         then error "Could not remove the repair file $repairfile from $machine in $BBENV"
       fi
       rm $resultfile
       if [ $? -ne 0 ]
         then error "Could not remove the result file $resultfile from $node_source in $BBENV"
       fi
       # Leave the log file unless the user has requested the "clean" option:
       if [ "$clean" ]
         then rm $logfile
             if [ $? -ne 0 ]
               then error "Could not remove the log file $logfile from $machine in $BBENV"
             fi
       fi
fi

exit 0
dbbatch.sh
Code:
# This is a set of parameters for setting up Oracle batch jobs.
CLIENT=client.com;                                                    export CLIENT
# I AM NOT QUITE SURE WHAT TO PUT AS THE CLIENT = ????
MACHINE=`uname -n`;                                                   export MACHINE
DEFAULT_MAIL=me@email.com;                                            export DEFAULT_MAIL
DBA_MAIL=me@email.com,oncallguy@company.com;                          export DBA_MAIL
APP_EMAIL=me@email.com;                                               export APP_EMAIL
ALL_MAIL=$DBA_MAIL","$APP_EMAIL                                       export ALL_MAIL
ORACLE_BASE=/oracle;                                                  export ORACLE_BASE
# My source database is 10g and my target database is 11g.
ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1;                         export ORACLE_HOME
ADR_BASE=$ORACLE_BASE;                                                export ADR_BASE
NODE=mydb1;                                                           export NODE
EDITOR=vi;                                                            export EDITOR
ORACLE_TERM=vt100;                                                    export ORACLE_TERM
TERM=vt100;                                                           export TERM
DBA=/orahome/app/oracle/home/;                                        export DBA
LOGS=/orahome/app/oracle/home/logs;                                    export LOGS
PATH=$ORACLE_HOME/bin:/grid/11.2.0/bin:$DBA:/usr/bin:/usr/sbin:/etc:/sbin:/bin
PATH=$PATH:/oracle/admin/scripts/
export PATH
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib:$ORACLE_HOME/network/jlib
export CLASSPATH
LD_LIBRARY_PATH=$ORACLE_HOME/lib;                                     export LD_LIBRARY_PATH
ORAENV_ASK=NO;                                                        export ORAENV_ASK
# Skip the prompt for the ORACLE_SID. (Executed only for non-interactive jobs).
if [ "`tty`" != "not a tty" ]
then
        #       Prompt for the desired ORACLE_SID for interactive jobs
        . $ORACLE_HOME/bin/oraenv
fi
# Ensure one (and only one) . is in the PATH
case "$PATH" in
        *.*)            ;;                      # If already in the path?
        *:)             PATH=${PATH}.: ;;       # If path ends in a colon?
        "")             PATH=. ;;               # If path is null?
        *)              PATH=$PATH:. ;;         # If none of the above?
esac
umask 177
PS_OPTS="-ef"
export PS_OPTS

Here is functions.sh.
Please note that I copied functions.sh from a different site, so I am not sure whether it is appropriate for my task.
Code:
# cd into a matching gem directory. Pass a grep compatible
# regular expression. If you don't pass anything, it goes
# to the root gem directory.
cdgem () 
{ 
    local gempath=$(gem env gemdir)/gems;
    if [[ $1 == "" ]]; then
        cd $gempath;
        return;
    fi;
    local gem=$(ls $gempath | grep $1 | sort | tail -1);
    if [[ $gem != "" ]]; then
        cd $gempath/$gem;
    fi
}

Here is seq1:/orahome/app/oracle/home/splex/spxfunctions.sh:

Code:
error()
{
  echo "$@" 1>&2
  exit 1
}

set_env()
{
  source ./spxenv.sh
}

check_shareplex()
{
 check=`ps -ef | grep "sp_cop" | grep -v grep | wc -l`
}

stop_shareplex()
{
  set_env
  now=`date '+%m/%d/%y %H:%M:%S'`
  echo "Stopping Shareplex at ${now}." | tee -a $logfile
  echo "This will take a few moments. Please be patient..." | tee -a $logfile

  $SP_SYS_PRODDIR/bin/sp_ctrl <<EOF >> $logfile
  shutdown
  exit
EOF

  EXITCODE=$?
  check_shareplex

    if [ $EXITCODE == 1 ] || [ $check -gt 0 ]
      then error "There was a problem stopping Shareplex!" | tee -a $logfile
    fi

  now=`date '+%m/%d/%y %H:%M:%S'`
  echo "Shareplex stopped successfully at ${now}." | tee -a $logfile
  exit 0
}

start_shareplex ()
{
  set_env
  now=`date '+%m/%d/%y %H:%M:%S'`
  echo "Starting Shareplex at ${now}." | tee -a $logfile
  echo "This will take a few moments. Please be patient..." | tee -a $logfile
  templog=$(mktemp)
  sh <<EOF > $templog
  $SP_SYS_PRODDIR/bin/sp_cop -u${BBHOST} &
EOF

  for ck in {1..12}
  do
    # Check every 5 seconds for up to a minute to see that sp_cop has started.
    sleep 5
    check_shareplex
    if [ $check -eq 1 ]
    then
      now=`date '+%m/%d/%y %H:%M:%S'`
      echo "Shareplex startup completed at ${now}." | tee -a $logfile
      # At this point the sp_cop process has started, but we want the startup
      # log to have all of the information about the startup. Sleep for 15
      # seconds and then cat the redirected nohup log into the main log file.
      sleep 15
      cat $templog >> $logfile
      exit 0
    fi
    done

  # If we get here, the startup didn't happen within 60 seconds.
  # Give up and throw an error (exit 1)
  error "There was a problem starting Shareplex!" | tee -a $logfile
}

truncate_shareplex_log()
{
  set_env
  now=`date '+%m/%d/%y %H:%M:%S'`
  echo "Truncating Shareplex log file at ${now}." | tee -a $logfile
  echo "This will take a few moments. Please be patient..." | tee -a $logfile

  $SP_SYS_PRODDIR/bin/sp_ctrl <<EOF >> $logfile
  truncate
  exit
EOF

  EXITCODE=$?
  check_shareplex

    if [ $EXITCODE == 1 ] || [ $check -gt 0 ]
      then error "There was a problem truncating the Shareplex log!" | tee -a $logfile
    fi

  rm $SP_SYS_VARDIR/log/event_log
  rm $SP_
  touch $SP_SYS_VARDIR/log/event_log

  now=`date '+%m/%d/%y %H:%M:%S'`
  echo "Shareplex event log truncated at ${now}." | tee -a $logfile
  exit 0

}

clean_shareplex()
{
  pkill -9 -f sp_
}

Lastly, here is seq1:/orahome/app/oracle/home/splex/spxenv.sh
Code:
# Customize for your environment.
#
# Get the environment and hostname:
# I am on SunOS and my uname -a is: SunOS seq1 5.10 Generic_xxxxxx-xx sun4v sparc sun4v
export BBENV=`hostname -d | cut -d"." -f2`
# I intend to use `uname -n` below
export BBHOST=`hostname -s`
#
export INST_TYPE="rdbms"
export LANG="en_US.UTF-8"
# By convention, our Shareplex SIDs all contain 'spx'. Get the correct local database entry, i.e. one that a) doesn't contain 'spx'
# and b) isn't ASM:
export ORACLE_DBNAME=`grep -v \^# /etc/oratab | grep -v spx | grep -v ASM | cut -d":" -f5`
export ORACLE_SID=`grep $ORACLE_DBNAME /etc/oratab | grep -v \^# | cut -d":" -f1`
export ORACLE_HOME=`grep $ORACLE_DBNAME /etc/oratab | grep -v \^# | cut -d":" -f2`
# Build the rest of the environment:
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export ORACLE_BASE=/oracle
export ORACLE_TERM=vt100
export ORAENV_ASK=NO
# Build the Shareplex variables:
export SP_COP_UPORT=2100
# The Shareplex install directories are NFS mounts and splex is reserved for them. This works in our environment, YMMV:
export SP_INSTALL_DIR=`df -h | grep \/splex | awk '{ print $NF }'`
# Get the ASM sid from the oratab:
export SP_OCT_ASM_SID=`grep ASM /etc/oratab | grep -v \^# | cut -d":" -f1`
# Get the Shareplex SID. spx is reserved for Shareplex in our environment, so this works throughout. YMMV:
export SP_SYS_HOST_NAME=`grep spx /etc/oratab | cut -d":" -f1`
# Version 8
export SP_SYS_PRODDIR=$SP_INSTALL_DIR/product/8
export SP_SYS_VARDIR=$SP_INSTALL_DIR/var
export IW_HOME=$SP_SYS_PRODDIR/util
# Set up the path:
export PATH=$ORACLE_HOME/bin:/shared/dbscripts:/usr/bin:/usr/sbin:/etc:/sbin:/bin:$SP_SYS_PRODDIR/bin
export TNS_ADMIN=$ORACLE_HOME/network/admin
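
Since (as far as I can tell) hostname -s and hostname -d are not available on stock Solaris 10, I am thinking of substituting something like this for the two BB* lines above (just a sketch, not tested yet; the BBENV part is a guess):
Code:
# Assumed Solaris-friendly replacements for the two hostname calls above:
export BBHOST=`uname -n`
# No direct hostname -d equivalent on Solaris; one guess is to read the domain from resolv.conf:
export BBENV=`awk '/^domain/ {print $2}' /etc/resolv.conf | cut -d"." -f1`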

This is how I am executing the code from /orahome/app/oracle/home:
seq1:oracle::~/splex-> ps -ef | grep sp_cop | grep -v grep | grep 2150
oracle 7712 1 0 Oct 02 ? 0:01 /orahome/app/SharePlex/11g/prod/.app-modules/sp_cop -u2150_ctxi
oracle 1240 1 0 Oct 02 ? 0:01 /orahome/app/SharePlex/10g/prod/.app-modules/sp_cop -u2150_sfgi


/orahome/app/oracle/home/spxrepair.sh -s 2150_sfgi -t 2150_ctxi -l /shared/dbscripts/splex/ \ -a /orahome/app/oracle/home/spxrepair_header -n seq1,seq1-ds -e myemail@myemail.com -f admin@myemail.com
 
I assume the issue is that your scripts are now getting errors that they did not previously get. First, I would check the obvious; it seems the environment has probably changed. Was there an OS upgrade or patch? Are all directory and folder permissions the same? Permissions on the applications?
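
For example, a quick way to eyeball the permissions on the pieces involved (the paths are the ones from your post) might be:
Code:
# Script directory, the scripts themselves, and the shared repair directory:
ls -ld /orahome/app/oracle/home /shared/dbscripts/splex
ls -l  /orahome/app/oracle/home/*.sh /orahome/app/oracle/home/splex/*.sh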

What exactly are the errors you are getting?

 
Thanks for your reply, johnherman.

This is my first time running the scripts in my environment.
The script did not report any errors; it just hung while executing:
/orahome/app/oracle/home/spxrepair.sh -s 2150_sfgi -t 2150_ctxi -l /shared/dbscripts/splex/ \ -a /orahome/app/oracle/home/spxrepair_header -n seq1,seq1-ds -e myemail@myemail.com -f admin@myemail.com
 
Like most people on Tek-Tips, I have a full-time job and thus don't have time to review several hundred lines of Oracle and Unix/Linux shell scripts. I suggest you test the individual components one at a time (see the sketch below). Questions about the Oracle environment would also be better suited to one of the Oracle forums.
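
For example, something along these lines (a rough sketch; the SIDs and host names come from your command line, the rest is guesswork about your environment):
Code:
# 1. Source the environment files spxrepair.sh sources, then sanity-check the variables they set:
. $HOME/dbbatch.sh
. $DBA/splex/spxenv.sh
echo "DBA=$DBA  SP_SYS_PRODDIR=$SP_SYS_PRODDIR  BBHOST=$BBHOST  BBENV=$BBENV"

# 2. Confirm sp_ctrl answers for the target SID on its own:
sp_ctrl show sync on 2150_ctxi

# 3. Confirm password-less ssh to each node in your -n list
#    (an interactive password prompt here would also make spxrepair.sh appear to hang):
ssh seq1 'ps -ef | grep sp_cop | grep -v grep'
ssh seq1-ds 'ps -ef | grep sp_cop | grep -v grep'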

 
Your post was missing the two most important ingredients: a question, and error messages.

Also, as johnherman said, posting several hundred lines of code with an implied shoulder shrug is kind of missing the point of this site. It's not here to fix your code for you. While I like helping people with their problems, just dumping code gives us pretty much nothing to go on.

Anyway, you could try adding debug switches like "set -xv" at the very beginning to see what it's doing while it's doing it, for example:
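
Code:
# At the top of spxrepair.sh: echo each line as it is read (-v) and each command as it runs (-x):
set -xv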

Also, this line in dbbatch.sh is missing a semicolon...

Code:
ALL_MAIL=$DBA_MAIL","$APP_EMAIL                                       export ALL_MAIL

It will still define ALL_MAIL; it just won't be what you expect, and it could cause problems later.
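
With the semicolon added (the same pattern as the other lines in dbbatch.sh), it would look like this:
Code:
ALL_MAIL=$DBA_MAIL","$APP_EMAIL;                                      export ALL_MAIL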

 
Thanks to everyone that helped with this.
I have decided to take a different route to meet the requirement above.
 