sed script

Status
Not open for further replies.

malpa

Technical User
Feb 8, 2004
122
CO

Hi there,

I have a file with many records. Each record is structured with tags, and one special tag, <txt>, holds the information I need. From that tag I am trying to extract the first field plus some of the information in the second and third fields, using sed. The field separator is "*".

log_xx.xml

<msg time='2016-11-03T05:52:36.591-05:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='tgestion1'
host_addr='xx.xx.xx.xx'>
<txt>03-NOV-2016 05:52:36 * (CONNECT_DATA=(SID=catrman2)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)
(PORT=57902)) * establish * catrman2 * 12505
</txt>
</msg>

The first field contains the date.
The second field contains information such as SERVICE_NAME, PROGRAM, HOST and USER.
The third field contains PROTOCOL, HOST and PORT.
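
To make the field layout concrete, here is a small sketch that pulls out the first and third fields with sed. It assumes the whole record sits on one line (the sample above is wrapped) and that no field contains a "*" of its own. The variable names record, field1 and field3 are only for illustration.

```shell
# Sample record from the log excerpt, joined onto one line (assumption).
record='03-NOV-2016 05:52:36 * (CONNECT_DATA=(SID=catrman2)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)(PORT=57902)) * establish * catrman2 * 12505'

# Field 1: delete everything from the first " * " onwards.
field1=$(printf '%s\n' "$record" | sed 's/ \* .*//')

# Field 3: drop the first two fields, then everything after the next " * ".
field3=$(printf '%s\n' "$record" | sed 's/^[^*]*\* [^*]*\* //; s/ \* .*//')

printf '%s\n%s\n' "$field1" "$field3"
```

Using [^*]* instead of .* keeps each match inside a single field, which matters once several " * " separators appear on the same line.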

I am currently using sed and awk together, and that works fine. I guess it could be improved, but I have not been able to make it work using only sed.

file.sed

#!/bin/sed -f
# Print only the lines between <txt> and </txt>, with the tags stripped.
/<txt>/,/<\/txt>/ {
s/<txt>//g ; s/<\/txt>//g
p
}


sed -n -f ${DIR}file.sed log_778.xml log_779.xml log_780.xml log_781.xml log_782.xml log_783.xml log_784.xml log_785.xml log.xml | gawk -F"*" '
# Handle only six-field listener records; skip lines containing TIMESTAMP.
NF==6 && $0!~/TIMESTAMP/ {
    sub("^ +","",$1)
    fecha=substr($1,1,14)            # date and hour, e.g. "03-NOV-2016 05"
    data[1]=$2                       # second field: PROGRAM, HOST, USER, SERVICE_NAME
    data[2]=$3                       # third field: PROTOCOL, HOST, PORT
    split("",aa)                     # clear values left over from the previous record

    k=split(data[1],ar,"[()]")
    for ( i=1;i<=k;i++ ){
        # d[] records each KEY=value string that has already been handled
        if( ar[i]~/PROGRAM=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/PROGRAM=/, "" , ar[i])
            aa[1]=ar[i]
        }
        if( ar[i]~/HOST=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/HOST=/, "" , ar[i])
            aa[2]=ar[i]
        }
        if( ar[i]~/USER=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/USER=/, "" , ar[i])
            aa[3]=ar[i]
        }
        if( ar[i]~/SERVICE_NAME=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/SERVICE_NAME=/, "" , ar[i])
            aa[4]=ar[i]
        }
    }
    for (l in d) delete d[l]

    k=split(data[2],ar,"[()]")
    for ( i=1;i<=k;i++ ){
        if( ar[i]~/PROTOCOL=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/PROTOCOL=/, "" , ar[i])
            aa[5]=ar[i]
        }
        if( ar[i]~/HOST=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/HOST=/, "" , ar[i])
            aa[6]=ar[i]
        }
        if( ar[i]~/PORT=/ && !(ar[i] in d) ){
            d[ar[i]]; sub(/PORT=/, "" , ar[i])
            aa[7]=ar[i]
        }
    }
    for (l in d) delete d[l]

    print fecha";"aa[1]";"aa[2]";"aa[3]";"aa[4]";"aa[5]";"aa[6]";"aa[7]
}' | cut -d";" -f1,3,7 | sort | uniq -c

output
------

count date Hour host IP
86 01-NOV-2016 00;__jdbc__;xx.xx.xx.xx
322 01-NOV-2016 00;__jdbc__;xx.xx.xx.xx
222 01-NOV-2016 00;__jdbc__;xx.xx.xx.xx
2 01-NOV-2016 00;__jdbc__;xx.xx.xx.xx
68 01-NOV-2016 00;xxxxxxxx;xx.xx.xx.xx
12 01-NOV-2016 01;xxxxxxxx;xx.xx.xx.xx


I am trying to do the same with sed alone, but I have not been able to.

I am having problems with the H, G and x commands (the hold space). In the end I want to print the information found for each record on one line. If a piece of information is not found, an empty field (;;) must be printed.
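
For reference, a minimal sketch of how the hold-space commands behave, on made-up input rather than the real log. h copies the pattern space to the hold space, G appends the hold space to the pattern space after a newline, and x swaps the two. One consequence, at least with GNU sed where "." also matches that embedded newline: after G the pattern space holds the extracted piece plus the whole saved record, so a following substitution that starts with .* and replaces the match with \1 also wipes out whatever was extracted before.

```shell
# h saves the line; the substitution keeps only the first field;
# G then appends the saved line after a newline, shown here as " | ".
out=$(printf 'A * B\n' | sed -n 'h; s/ \* .*//; G; s/\n/ | /; p')
printf '%s\n' "$out"     # extracted field, then the saved original line

# x swaps pattern space and hold space: on line 2 it brings back line 1.
out2=$(printf 'first\nsecond\n' | sed -n '1h; 2{x;p;}')
printf '%s\n' "$out2"    # the line that was saved while reading line 1
```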

Initial program

myprogramm.sed

/<txt>/,/<\/txt>/{
s/^.*<txt>//g ; s/<\/txt>//g
/.* \* .* \* .* \* .* \* .* \* .*/ {
#/([^(|^)]*)/{
h
s/\(.*\) \* .* \* .* \* .* \* .* \* .*/\1/p
G
s/.* \* .*\((SERVICE_NAME[^(|^)]*)\).* \* .* \* .* \* .* \* .*/\1/p
G
s/.* \* .*\((PROGRAM[^(|^)]*)\).* \* .* \* .* \* .* \* .*/\1/p
G
s/.*\((HOST[^(|^)]*)\).* \* .* \* .* \* .* \* .*/\1/p
G
s/.*\((USER[^(|^)]*)\).* \* .* \* .* \* .* \* .*/\1/p
G
s/.* \* .* \* .*\((PROTOCOL[^(|^)]*)\).* \* .* \* .* \* .*/\1/p
G
s/.* \* .* \* .*\((HOST[^(|^)]*)\).* \* .* \* .* \* .*/\1/p
G
s/.* \* .* \* .*\((PORT[^(|^)]*)\).* \* .* \* .* \* .*/\1/p
G
#}
}

}


I would like this output, using sed
==================================

03-NOV-2016 14:22:10 ;(PROGRAM=) ;(HOST=__jdbc__); (USER=oracle) ; (PROTOCOL=tcp) ;(HOST=10.81.203.19) ; (PORT=44390)
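
One possible way to get a line like this with sed alone, offered as a sketch rather than a tested solution for real listener logs. It avoids the hold space: each key is appended to the end of the record as ";(KEY=value)", or as a bare ";" when the key is missing, and the record itself is cut away at the end. Assumptions: fields never contain "*" or ";", a record may be wrapped over several lines between <txt> and </txt>, and the file names extract.sed and sample.xml are hypothetical.

```shell
# Sample record as posted above, including the wrapped line.
cat > sample.xml <<'EOF'
<msg time='2016-11-03T05:52:36.591-05:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='tgestion1'
host_addr='xx.xx.xx.xx'>
<txt>03-NOV-2016 05:52:36 * (CONNECT_DATA=(SID=catrman2)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)
(PORT=57902)) * establish * catrman2 * 12505
</txt>
</msg>
EOF

cat > extract.sed <<'EOF'
/<txt>/{
  # Join the lines of one record until </txt> has been read.
  :join
  /<\/txt>/!{
    N
    b join
  }
  s/\n//g
  s/.*<txt>//
  s/<\/txt>.*//
  # Skip anything that does not have at least three "*"-separated fields.
  /^[^*]*\* [^*]*\* [^*]*\* /!d
  # Second field: append ";" and, if the key exists, turn it into ";(KEY=value)".
  s/$/;/
  s/^\([^*]*\* [^*]*\)\((PROGRAM=[^()]*)\)\(.*\);$/\1\2\3;\2/
  s/$/;/
  s/^\([^*]*\* [^*]*\)\((HOST=[^()]*)\)\(.*\);$/\1\2\3;\2/
  s/$/;/
  s/^\([^*]*\* [^*]*\)\((USER=[^()]*)\)\(.*\);$/\1\2\3;\2/
  # Third field: same idea, one more "* " to skip.
  s/$/;/
  s/^\([^*]*\* [^*]*\* [^*]*\)\((PROTOCOL=[^()]*)\)\(.*\);$/\1\2\3;\2/
  s/$/;/
  s/^\([^*]*\* [^*]*\* [^*]*\)\((HOST=[^()]*)\)\(.*\);$/\1\2\3;\2/
  s/$/;/
  s/^\([^*]*\* [^*]*\* [^*]*\)\((PORT=[^()]*)\)\(.*\);$/\1\2\3;\2/
  # Keep the date and the appended list; drop the rest of the record.
  s/^ *\([^*]*[^* ]\) *\* [^;]*;/\1;/
  p
}
EOF

result=$(sed -n -f extract.sed sample.xml)
printf '%s\n' "$result"
```

The same script could be run over several files at once, for example sed -n -f extract.sed log_778.xml log_779.xml. The [^*]* pieces pin each key to its own field, so HOST from the second field and HOST from the third field stay apart.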



Thanks a lot for your comments
Malpa
 