inserting identification into each field

Status
Not open for further replies.

euro08

Technical User
Jul 1, 2008
3
ES
Hello,

I am a chemist and not familiar with awk, but I think awk could be very helpful for my problem. I have a text file with coordinate parameters for thousands of compounds, as follows:

-ISIS- 06130516212D

21 24 0 0 0 0 0 0 0 0999 V2000
-5.4833 1.5125 0.0000 N 0 0 0 0 0 0 0 0 0
5.0375 1.5125 0.0000 N 0 0 0 0 0 0 0 0 0
7.6667 -3.0458 0.0000 N 0 0 0 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (X10052)
X10052

$$$$

-ISIS- 06130516212D

18 19 0 0 0 0 0 0 0 0999 V2000
-4.1958 3.7000 0.0000 N 0 0 3 0 0 0 0 0 0
-8.1250 1.4250 0.0000 C 0 0 0 0 0 0 0 0 0
-4.1958 8.2375 0.0000 C 0 0 0 0 0 0 0 0 0
-12.0625 8.2375 0.0000 S 0 0 3 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (L10021)
L10021

$$$$

-ISIS- 06130516212D

16 17 0 0 0 0 0 0 0 0999 V2000
-0.3333 -1.2208 0.0000 C 0 0 0 0 0 0 0 0 0
-8.1625 -1.2208 0.0000 N 0 0 3 0 0 0 0 0 0
7.5000 -1.2208 0.0000 N 0 0 3 0 0 0 0 0 0
3.5833 -3.4792 0.0000 N 0 0 0 0 0 0 0 0 0
-4.2500 -3.4792 0.0000 N 0 0 0 0 0 0 0 0 0
-0.3333 3.3042 0.0000 O 0 0 0 0 0 0 0 0 0
-16.0000 3.3042 0.0000 S 0 0 0 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (B10023)
B10023

$$$$

and so on.......

The information for each compound lies between the terms "ISIS" and "$$$$". What I need is to copy the number that follows the "IDNUMBER" term onto a new first line for each compound, just before the "ISIS" line (otherwise I am not able to read the file correctly with my chemistry software).

Thank you in advance for your help/tips!
 
If you can show us an example of how you need the output to look, given your example input text, that would make it easier to understand.
Have you tried anything yourself?
 
Many thanks geirendre for your rapid response!

I have tried to understand how AWK works, and I have read the FAQ section, but for now all of this is very hard for me, since I am not used to programming...

Please find below the output I would need for the first compounds shown in my initial message:

X10052
-ISIS- 06130516212D

21 24 0 0 0 0 0 0 0 0999 V2000
-5.4833 1.5125 0.0000 N 0 0 0 0 0 0 0 0 0
5.0375 1.5125 0.0000 N 0 0 0 0 0 0 0 0 0
7.6667 -3.0458 0.0000 N 0 0 0 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (X10052)
X10052

$$$$

L10021
-ISIS- 06130516212D

18 19 0 0 0 0 0 0 0 0999 V2000
-4.1958 3.7000 0.0000 N 0 0 3 0 0 0 0 0 0
-8.1250 1.4250 0.0000 C 0 0 0 0 0 0 0 0 0
-4.1958 8.2375 0.0000 C 0 0 0 0 0 0 0 0 0
-12.0625 8.2375 0.0000 S 0 0 3 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (L10021)
L10021

$$$$

B10023
-ISIS- 06130516212D

16 17 0 0 0 0 0 0 0 0999 V2000
-0.3333 -1.2208 0.0000 C 0 0 0 0 0 0 0 0 0
-8.1625 -1.2208 0.0000 N 0 0 3 0 0 0 0 0 0
7.5000 -1.2208 0.0000 N 0 0 3 0 0 0 0 0 0
3.5833 -3.4792 0.0000 N 0 0 0 0 0 0 0 0 0
-4.2500 -3.4792 0.0000 N 0 0 0 0 0 0 0 0 0
-0.3333 3.3042 0.0000 O 0 0 0 0 0 0 0 0 0
-16.0000 3.3042 0.0000 S 0 0 0 0 0 0 0 0 0
(etc.)

M END
> <IDNUMBER> (B10023)
B10023

$$$$

In all cases, the line inserted just above the "ISIS" line contains the value of the "IDNUMBER" field, which appears several lines further down in the same compound.
 
Something like this perhaps:

Code:
awk '
        # new compound, reset the counter
        /-ISIS-/ { i=0 }
        # store lines for this compound in line buffer
        { line[++i]=$0 }
        # grab the IDNUMBER from the line following it
        /<IDNUMBER>/ {
                getline
                id=$0
                line[++i]=$0
        }
        # end of compound, print out the buffer
        /\$\$\$\$/ {
                print id
                for (j=1; j<=i; j++) { print line[j] }
                # including the blank line after it, if there is one
                # (the guard avoids re-printing "$$$$" at end of file)
                if ((getline) > 0) print
        }
' inputfile
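
A variation on the same buffering idea, sketched here without getline: it uses a flag to pick up the line after the IDNUMBER tag and passes any lines between "$$$$" and the next "-ISIS-" straight through, so it does not rely on exactly one blank line following "$$$$". The function name prepend_ids and the variable names are only illustrative, and it assumes every compound starts at a line containing "-ISIS-", as in the sample above.

```shell
# Hypothetical helper (name chosen for illustration):
# reads the compound file on stdin, writes the result to stdout.
prepend_ids() {
    awk '
        # a line containing -ISIS- starts a new compound: reset the buffer
        /-ISIS-/ { inrec = 1; n = 0; id = "" }
        # lines outside a compound (e.g. blank separators) pass straight through
        !inrec { print; next }
        # buffer every line of the current compound
        { buf[++n] = $0 }
        # the line right after the IDNUMBER tag holds the ID
        grab { id = $0; grab = 0 }
        /<IDNUMBER>/ { grab = 1 }
        # "$$$$" ends the compound: print the ID first, then the buffered lines
        /^\$\$\$\$/ {
            print id
            for (j = 1; j <= n; j++) print buf[j]
            inrec = 0
        }
    '
}
```

It could then be run as: prepend_ids < inputfile > outputfile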

Annihilannic.
 
Dear Annihilannic,

Thank you very much for your message; your script works perfectly!

I am very grateful to you because this script saves me a lot of time, and it encourages me to learn AWK. It is a very useful tool!

Thanks again,
euro08
 