extracting text between patterns

Status
Not open for further replies.

philipose

Programmer
Dec 24, 2003
137
US
Hi,
I am a little stuck on a text extraction issue. I have files that follow the pattern below:


some text ...........

<error>
some message ..............
</error>

some text ...........

<error>
some message ..............
</error>

some text ...........


I am interested in extracting all the lines between <error> and </error>, including the tags themselves. I can easily grep for the <error> line or the </error> line, but I am not sure how to get everything in between. Is this possible using sed, tr -d, or something similar? Any suggestions would be great.

Thanks a lot
philipose
 
It can be done with a loop and a flag variable:

Code:
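r=$(cat file.txt)
found=0
# unquoted $r is split on whitespace, so the loop steps through words
for i in $r
do
    if [[ $i == "<error>" ]]
    then
        # opening tag: start collecting (the tag itself is skipped)
        found=1
        continue
    elif [[ $i == "</error>" ]]
    then
        # closing tag: stop collecting (the tag itself is skipped)
        found=0
        continue
    fi

    if [[ $found -eq 1 ]]
    then
        echo "$i" >>/tmp/file.errors
    fi
done

cat /tmp/file.errors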

By the way, this script is not tested, but it shows the logic, and it is only one way of doing it. I think this can also be done with a single awk or sed statement.

Regards,
Khalid
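
The single awk or sed statement mentioned above could look something like the sketch below. It relies on the address range form /start/,/end/ that both tools support. The sample file is made up for illustration and only mirrors the layout from the question. Note that these patterns match anywhere in a line, so a line that merely contains <error> would also start a range.

```shell
# Build a small sample file (hypothetical content, modelled on the question)
sample=$(mktemp)
cat > "$sample" <<'EOF'
some text
<error>
first message
</error>
some text
<error>
second message
</error>
some text
EOF

# -n suppresses sed's default output; the range with p prints every
# line from <error> through </error>, tags included
extracted=$(sed -n '/<error>/,/<\/error>/p' "$sample")
printf '%s\n' "$extracted"

# The same range pattern in awk; with no action, matching lines are printed
extracted_awk=$(awk '/<error>/,/<\/error>/' "$sample")

rm -f "$sample"
```

Both commands keep the tag lines, which is what the original question asked for.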
 
Khalid,
Cool. I think I got what you meant. I should be able to use this logic.
Thanks again
Philipose
 
Khalid,
I ran into one issue. The output of cat was the whole file in one line, i.e. each line is not read separately. I think your approach needs the lines to be taken one at a time, right?

thanx
philipose
 
Then try this:

Code:
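FILE=/yourfilepath
found=0
# read one line at a time; IFS= and -r keep each line exactly as written
while IFS= read -r i
do
    if [[ $i == "<error>" ]]
    then
        # opening tag: start collecting (the tag itself is skipped)
        found=1
        continue
    elif [[ $i == "</error>" ]]
    then
        # closing tag: stop collecting (the tag itself is skipped)
        found=0
        continue
    fi

    if [[ $found -eq 1 ]]
    then
        echo "$i" >>/tmp/file.errors
    fi
done < "$FILE"

cat /tmp/file.errors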

Regards,
Khalid
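
The difference between the two loops above comes down to word splitting. The small sketch below counts iterations for each style on a made-up four-line sample file (the file contents and variable names are assumptions for illustration only).

```shell
# Hypothetical sample: four lines, two of them containing spaces
sample=$(mktemp)
printf '%s\n' 'some text here' '<error>' 'disk is full' '</error>' > "$sample"

# Unquoted $(cat ...) is split on any whitespace: one iteration per word
words=0
for w in $(cat "$sample"); do
    words=$((words + 1))
done

# read -r with an empty IFS consumes exactly one line per iteration
lines=0
while IFS= read -r line; do
    lines=$((lines + 1))
done < "$sample"

echo "for loop iterations:   $words"
echo "while read iterations: $lines"
rm -f "$sample"
```

The for loop runs once per word, while the while read loop runs once per line, which is why only the second form can compare a whole line against a tag such as <error>.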
 
Khalid,
Thanks for the tip. I made it work with arrays.
Thanks
Philipose

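Code:
# Save the original IFS so it can be restored at the end
OFS="$IFS"
# Set IFS to newline so that a line containing spaces stays one array element
IFS='
'

for LOGS in "$@"; do

    # ksh array assignment, one element per line of the file.
    # Empty lines are dropped by the word splitting, so the numbers
    # printed below count non-empty lines only.
    set -A LINES `grep -E "^.*$" "${LOGS}"`
    LINE_NUM=0
    ERROR=0

    while [[ $LINE_NUM -lt ${#LINES[@]} ]]; do

        if [[ "${LINES[$LINE_NUM]}" == "<error>" ]]; then
            ERROR=1
        fi
        if [[ "${LINES[$LINE_NUM]}" == "</error>" ]]; then
            # print the closing tag here, because ERROR is cleared
            # before the check below runs
            echo "${LOGS}: $((LINE_NUM + 1)) : ${LINES[$LINE_NUM]}"
            ERROR=0
        fi

        if [[ $ERROR -eq 1 ]]; then
            echo "${LOGS}: $((LINE_NUM + 1)) : ${LINES[$LINE_NUM]}"
        fi

        LINE_NUM=$((LINE_NUM + 1))
    done

done

# Restore original IFS
IFS="$OFS"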
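
For comparison, a report in the same "file: line number : text" layout can probably be produced with awk alone. The following is only a sketch under assumptions: the sample log is invented, and the patterns are anchored with ^ and $ so that only lines consisting of exactly the tag open or close a block. FILENAME and FNR are standard awk variables holding the current file name and the line number within that file.

```shell
# Hypothetical log file; note the blank second line
log=$(mktemp)
printf '%s\n' 'some text' '' '<error>' 'disk is full' '</error>' 'more text' > "$log"

# Print file name, line number and text for every line of each error block
report=$(awk '/^<error>$/,/^<\/error>$/ { print FILENAME ": " FNR " : " $0 }' "$log")
printf '%s\n' "$report"

rm -f "$log"
```

Because FNR counts every input line, blank lines do not shift the reported numbers. Several log files can be passed to one awk call, and FNR restarts at 1 for each of them.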
 
Your coding seems to be better than mine :)

Well done and good luck

Regards,
Khalid
 