Selectively splitting a file with C-shell?


fusi0n

Technical User
Nov 1, 2006
10
US
I have a rather long csh script that works, but it's terribly inelegant and slow because of its many loops. I only know enough code to get myself into trouble, so I'm looking for some guidance.

I have a large file that is separated at regular intervals by the same line, like this:

@<TRIPOS>molecule
name_000
blah
yadda
stuff
@<TRIPOS>molecule
name_001
blah2
yadda2
stuff2
@<TRIPOS>molecule
name_002
blah3
...

and so on, up to name_200.

I need to split this file into individual files named name_000.mol2, name_001.mol2, name_002.mol2, etc., each of which contains:

@<TRIPOS>molecule
name_000 (or name_whatever)
blah
yadda
stuff

Currently I grep -n 'TRIPOS>molecule' the big file into another file, then use the line numbers to cycle through and pull out lines x through y, naming each file after the contents of line x+1 (name_000) with .mol2 appended. This takes an incredibly long time and is rather clunky. I don't know whether it's the file cycling or all the variables I pass back and forth that eats up the time, but if anyone knows of an easier way to split the big file up, I would be greatly appreciative!

Any ideas?

Thanks!
 
man awk

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
I tried awk previously, but couldn't figure out a better solution, since the number of lines between each @<TRIPOS>molecule separator varies, as does the number of entries in the file.

I'll go back and try again though.
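
For what it's worth, a rough awk sketch of the split described above might look like the following. It keys on the separator line itself, so the varying number of lines per entry should not matter. It assumes the line right after each separator is the name to use for the file (with no stray trailing whitespace), that the names are unique, and that it is run in a scratch directory. The file name split_mol2.awk and the sample data are made up for illustration; only infile and the separator come from the thread.

```shell
# Sample input in the layout described above (made-up data for illustration).
cat > infile <<'EOF'
@<TRIPOS>molecule
name_000
blah
yadda
stuff
@<TRIPOS>molecule
name_001
blah2
yadda2
EOF

# split_mol2.awk is a hypothetical file name for the awk program.
cat > split_mol2.awk <<'EOF'
# Separator line: close the previous output file and expect a name next.
$0 == "@<TRIPOS>molecule" { if (out) close(out); getname = 1; next }
# First line after a separator: use it as the file name, re-emit the separator.
getname { out = $0 ".mol2"; print "@<TRIPOS>molecule" > out; getname = 0 }
# Every non-separator line (including the name line) goes to the current file.
out { print > out }
EOF

# One pass over the input; writes name_000.mol2, name_001.mol2, ...
awk -f split_mol2.awk infile
```

Keeping the awk program in its own file means the last line is an ordinary command that can be called from a csh script as well, and the input is read only once instead of once per entry.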
 
Something along the lines of
Code:
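while read line
do
  # Skip the separator; it is rewritten at the top of each output file
  [[ $line = '@<TRIPOS>molecule' ]] && continue
  case $line in
    name_* )
      # A name line starts a new output file named after it
      OUTFILE=${line}.mol2
      echo '@<TRIPOS>molecule' > "$OUTFILE"
      echo "$line" >> "$OUTFILE"
      ;;
    *) echo "$line" >> "$OUTFILE" ;;
  esac
done < infile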

If any of the lines in infile contain spaces, add
Code:
IFS='
'
before the loop.
Not tested, but I think you'll get the idea.

Ceci n'est pas une signature
Columb Healy
 
Sorry, one more thing: I know you specified C shell, but run this under ksh.

Ceci n'est pas une signature
Columb Healy
 
Unfortunately, I need to run this script under csh (or I have to convert a lot of other code too), BUT this does give me some good ideas about where to go from here.

Thanks so much!
 