Hi, I am trying to write a sed script that excludes a 10-line XML segment whenever a given pattern is found in it. All 10-line segments are identically structured.
Here is an example of the first 3 records of master XML file printers.conf:
<Printer CL_010002>
Info pa010002
DeviceURI socket://pa010002ort
State Idle
Accepting Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
</Printer>
<Printer CL_010003>
Info pa010003
DeviceURI socket://pa010003ort
State Idle
Accepting Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
</Printer>
<Printer CL_010013>
Info pa010013
DeviceURI socket://pa010013ort
State Idle
Accepting Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
</Printer>
......
Here is my attempt at a sed script that searches for a particular XML segment and removes it based on a 6-digit numeric pattern (010013, for example). I am having trouble creating a delta file of the XML segments excluded from the master file. My approach in the script below is as follows. First, I determine the start and end line numbers of the XML block to be deleted. However, as I loop through the 6-digit patterns read from an explicitly defined file (*.dat), I need to retain the unmatched 10-line XML blocks in their original position and sequence, while the 10-line XML blocks with a match should be excluded from the resulting XML file. Could you please help? Thanks in advance!
============================================================
#!/usr/bin/ksh
CUPSDir=/path/to/file/
CUPS_TABLE=${CUPSDir}printers.conf
#cat -n *xml|egrep '<Printer CL|</Printer>' => extract number of lines
#1 cat -n printers.xml|egrep '<Printer CL|</Printer>'|grep 414401|nawk '{print $1}' => extract the number of start block
if [[ -f ${CUPSDir}printer.conf.update ]]
then
rm ${CUPSDir}printer.conf.update
fi
if [[ -f ${CUPSDir}printer.conf.read ]]
then
rm ${CUPSDir}printer.conf.read
fi
while read office
do
#sblock=0
#eblock=0
#echo locating office $office
#sblock=`cat -n $CUPS_TABLE|egrep '<Printer CL|</Printer>'|grep 414401|nawk '{print $1}'`
sblock=`cat -n $CUPS_TABLE|egrep '<Printer CL|</Printer>'|grep $office|nawk '{print $1}'`
if [[ -n "$sblock" ]]
then
let eblock=sblock+9
#echo start block is $sblock
#echo end block is $eblock
#2. sed -n 13,22p *xml =>extract the segment of needed branch
#This command will extract only the xml segments which need to be removed from the master file $CUPS_TABLE
sed -n "$sblock","$eblock"p $CUPS_TABLE >> ${CUPSDir}printer.conf.read
sed -n "$sblock","$eblock"!p $CUPS_TABLE > ${CUPSDir}printer.conf.read
#sed "$sblock","$eblock"d $CUPS_TABLE > ${CUPSDir}printer.conf.$office
#3. sed 's!13,22d' printers.xml => delete XML segment of closed office
#This command will remove only the xml segment based on the last office read in the list ${CUPSDir}printers_to_purge.dat
sed -e "$sblock","$eblock"d $CUPS_TABLE > ${CUPSDir}printer.conf.update
#This command will append duplicate xml segments which need to be removed from the master file $CUPS_TABLE
sed -e "$sblock","$eblock"d $CUPS_TABLE >> ${CUPSDir}printer.conf.update
#===========================================================
else
echo ERROR - can not locate office $office in $CUPS_TABLE
fi
done<${CUPSDir}printers_to_purge.dat
exit
====================================================
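For comparison, here is a minimal sketch of a single-pass alternative to the line-number loop above. Every sed call in the loop reads the original master file again, so only the last deletion can survive in the output. The sketch uses awk rather than sed, and the function name purge_printers and its four arguments are hypothetical. It assumes the office number is whatever follows the last underscore in the `<Printer ...>` line and that it must match an entry of the purge list exactly.

```shell
#!/bin/sh
# purge_printers MASTER PURGE_LIST KEPT REMOVED   (hypothetical helper)
# Reads MASTER once and writes every block to either KEPT or REMOVED,
# keeping the blocks in their original order.
purge_printers() {
    : > "$3"    # start with empty output files, like the rm calls above
    : > "$4"
    awk -v kept="$3" -v removed="$4" '
        BEGIN { out = kept }
        # First file argument: remember every office number to purge.
        FILENAME == ARGV[1] { if ($1 != "") purge[$1] = 1; next }
        # Opening line of a block: decide where the whole block goes.
        /^<Printer / {
            id = $2
            sub(/^.*_/, "", id)    # "CL_010013>" becomes "010013>"
            sub(/>.*$/, "", id)    # "010013>" becomes "010013"
            out = (id in purge) ? removed : kept
            if (id in purge) found[id] = 1
        }
        { print > out }
        END {
            # Report purge entries that never matched a block.
            for (id in purge)
                if (!(id in found))
                    print "ERROR - cannot locate office " id | "cat 1>&2"
        }
    ' "$2" "$1"
}
```

Called as `purge_printers "$CUPS_TABLE" "${CUPSDir}printers_to_purge.dat" "${CUPSDir}printer.conf.update" "${CUPSDir}printer.conf.read"`, this should leave the retained blocks in printer.conf.update and the excluded blocks (the delta) in printer.conf.read.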
Here is the content of ${CUPSDir}printers_to_purge.dat:
027105
028102
211300
211707
211719
211721
211725
211726
211760
211761
211762
211785
211814
211816
211817
211828
211831
211875
211876
211880
212151
213200
217700
219300
219802
321704
321712
321720
322105
540026
541704
541707
541715
547201
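If the solution should stay in sed, one possible sketch is to turn the purge list above into a sed script with one range command per office number, and then apply all of them in a single pass over the master file. The function names build_sed_script and purge_with_sed are hypothetical, and the sketch assumes every block opens with exactly `<Printer CL_NNNNNN>` and closes with `</Printer>`, as in the sample records.

```shell
#!/bin/sh
# build_sed_script PURGE_LIST ACTION   (hypothetical helper)
# Prints one sed range command per office number; ACTION is d or p.
# "010013" with action d becomes: /^<Printer CL_010013>/,/^<\/Printer>/d
build_sed_script() {
    sed -n 's|^\([0-9][0-9]*\) *$|/^<Printer CL_\1>/,/^<\\/Printer>/'"$2"'|p' "$1"
}

# purge_with_sed MASTER PURGE_LIST KEPT REMOVED   (hypothetical helper)
purge_with_sed() {
    build_sed_script "$2" d > "$3.sed"    # delete the listed blocks
    build_sed_script "$2" p > "$4.sed"    # print only the listed blocks
    sed -f "$3.sed" "$1" > "$3"           # master without the purged blocks
    sed -n -f "$4.sed" "$1" > "$4"        # delta: the purged blocks only
    rm -f "$3.sed" "$4.sed"
}
```

Because both sed runs read the master file once with the complete set of ranges, the unmatched blocks keep their original order, and office numbers that are not present in the master file simply have no effect.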