Insert data with values located n lines further

Status
Not open for further replies.

clemhoff

Technical User
Jul 10, 2007
19
FR

Hello everyone;

I have plenty of very large (>60 MB) pseudo-XML files that look like this:

<?xml version="1.0" encoding="UTF-8"?>
... more lines here
</ObjectStyle>
<Object type="Field" flags="67108864" flags="67108864">
<Style>128</Style>
<Bounds top=" 24" left="138" bottom=" 38" right="391"/>
<FieldObj numOfReps="1" flags="32" input="0" Type="0">
<Name>Isotype</Name>
<Info>
<Field name="Isotype" id="33759" rep="1" max="1" tab="Reagent"/>
</Info>
</FieldObj>
</Object>
... etc
-------------------------------------------------

My goal is to insert, in every line beginning with "<Object type=", a combination of values located 6 lines further down in the file.

Desired output:
<?xml version="1.0" encoding="UTF-8"?>
... more lines here
</ObjectStyle>
<Object type="Field" name="Obj.Reagent.Isotype" flags="67108864">    <--- Target
<Style>128</Style>
<Bounds top=" 24" left="138" bottom=" 38" right="391"/>
<FieldObj numOfReps="1" flags="32" input="0" Type="0">
<Name>Isotype</Name>
<Info>
<Field name="Isotype" id="33759" rep="1" max="1" tab="Reagent"/>    <--- Source
</Info>
</FieldObj>
</Object>
... etc

Is this possible to achieve?

I tried playing with
Code:
awk '/Field name=\"/{print p[(NR+1)%6]}{p[(NR+1)%6]=$0}'
and
awk 'BEGIN {FS="(=\"|\" |\"\/)"} /<Field name=\"/{print "name=\"".$10"_"$2}'
but without success :(


Thanks in advance for any input and assistance.
(awk version 20040207 - Mac OS 10.4.11)

 
Sorry to bring this up again, but is my problem really unsolvable?
If so, would you be so kind as to point me in the right direction?

Thank you in advance.

 
You had a good idea there to use a circular buffer.

Try this perhaps:

Code:
awk '
        /Field name=\"/ {
                split($0,a)
                # strip off unwanted characters
                sub("^.*=\"","",a[2]);
                sub("^.*=\"","",a[6]);
                sub("\"$","",a[2]);
                sub("\"/.*$","",a[6]);
                # construct the new object name as Obj.<tab>.<name>
                name="name=\"Obj." a[6] "." a[2] "\""
                # substitute it into the line 6 lines previous
                sub("flags=\"[0-9]+\"",name,p[(NR-6)%7])
        }
        # add line to buffer
        { p[NR%7]=$0 }
        # buffer full? start printing lines
        NR>6 { print p[(NR-6)%7] }
        # all done? print remaining buffered lines
        END { for (i=(NR>5?NR-5:1);i<=NR;i++) { print p[i%7] } }
'

This code makes some assumptions, namely that there are no spaces in the fields to be used and that those lines always have the same number of fields, but it should get you started. Note that it also replaces the first flags="..." attribute on the target line with the new name.

Annihilannic.
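
For anyone following along, the circular-buffer idea can be tried in isolation. The sketch below is only an illustration: `lookback` is a made-up helper name, not part of the script above. It prints the line that sits N lines before each line matching a pattern, and it keeps N+1 slots, mirroring the 7 slots used for a 6-line lookback above.

```sh
# Hypothetical helper: print the line found N lines before each match.
# usage: lookback N PATTERN [file...]
lookback() {
  n=$1; pat=$2; shift 2
  awk -v n="$n" -v pat="$pat" '
    # on a match, the slot for line NR-n still holds that earlier line
    $0 ~ pat && NR > n { print buf[(NR - n) % (n + 1)] }
    # store the current line, overwriting the oldest slot
    { buf[NR % (n + 1)] = $0 }
  ' "$@"
}
```

Something like `lookback 6 'Field name=' file.xml` should then list the lines that the script above would modify, which is a cheap way to check the offset before rewriting anything.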
 
Annihilannic,

You made my day!

Unfortunately, the fields contain spaces and other HTML delicacies that I need to keep, and the "flags" field gets overwritten by the last substitution. But I will manage that myself.

Thank you for your input, code, explanations & ideas!


Regards
Clement

 