
AIX: how to script: Need to extract specific txt into new file?


Guest_imported

I have a file whose first column holds account numbers (acct#) and whose second column holds text. I need to extract each section, from one account number up to the next, into a separate file named after the acct#.

I have very minimal scripting experience. I'd appreciate any help. I am familiar with vi.
Thanks in advance

Ex. File: comt.out

12345M xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
67890M xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
etc..
 
Here is a simple script to get you started.

This works in ksh, sh, bash, and zsh.

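# start script

while IFS= read -r line
do
    # A line that starts with a digit begins a new account section
    if echo "$line" | grep "^[0-9]" > /dev/null
    then
        fout=`echo "$line" | awk '{print $1}'`
        fileout=acct_${fout}
    fi

    # Append the line to the current account's file
    echo "$line" >> "$fileout"

done < myaccount_filename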


The while and done lines read the text file one line at a time through input redirection, and the loop exits when there are no more lines to read. The "if" clause looks for a digit in the first position on the line. If it finds one, it uses awk to take the first column of the line and stores it in the variable named fout. I then combine that value with a prefix in a second variable, fileout, to form the new filename. You could do this in one step, but the interpreter works no faster that way and this is clearer to read. Finally, the last line of the while loop appends each line to the current account's file.

I put the acct_ before the file names to make it easy to delete or work with the files later. A common prefix or suffix is a big help with file manipulation.
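As a worked illustration, the loop can be wrapped in a function and run against a few made-up lines. This is only a sketch: the function name split_accounts, the sample data, and the scratch directory are assumptions for the demonstration, and it assumes mktemp is available. It also skips any lines that come before the first account number, which the original loop does not handle.

```shell
# Hypothetical helper: split_accounts INPUT_FILE
# Writes each account section to acct_<first column> in the current directory.
split_accounts() {
    fileout=""
    while IFS= read -r line
    do
        # A leading digit marks the first line of a new account section
        if echo "$line" | grep "^[0-9]" > /dev/null
        then
            fout=`echo "$line" | awk '{print $1}'`
            fileout=acct_${fout}
        fi
        # Skip lines that appear before the first account number
        if [ -n "$fileout" ]
        then
            echo "$line" >> "$fileout"
        fi
    done < "$1"
}

# Try it on made-up data in a scratch directory
workdir=`mktemp -d`
cd "$workdir" || exit 1
printf '%s\n' '12345M first line' 'second line' '67890M other line' 'last line' > comt.out
split_accounts comt.out
ls acct_*
```

Each acct_ file should then hold the account line plus the lines that followed it, up to the next account number.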

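You might notice that the tabs aren't preserved with some shells. That happens when $line is used unquoted: the shell splits it into words and echo rejoins them with single spaces. Quoting the variable keeps the whitespace inside the line:

echo "$line" >> $fileout

If you want a tab at the start of each line, you can put it in manually:

echo "\t$line" >> $fileout

The characters "\t" represent a tab character in shells whose echo interprets backslash escapes, such as ksh. The bash builtin echo needs the -e option for that.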
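To see where the whitespace goes, here is a small sketch comparing an unquoted and a quoted expansion of the same line. The sample line is made up; the point is only that word splitting on the unquoted variable is what collapses tabs and runs of spaces.

```shell
# Made-up sample line with a tab between the columns and extra spaces in the text
line=`printf '12345M\tsome   text'`

# Unquoted: the shell splits the line into words and echo joins them with single spaces
unquoted=`echo $line`

# Quoted: the line reaches echo unchanged
quoted=`echo "$line"`

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

Reading with IFS= read -r line also matters if lines may begin with spaces or tabs, since a plain read strips leading and trailing whitespace.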

There are also some other things that you would need to do to this script to make sure that you don't overwrite your files or add too much data to them. Notice the ">>" in the echo line. This "appends" to the file. If you run your script twice on the same base file you will put data into your account files twice.
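One possible way to avoid the double append is to delete the previous output before the loop starts. The sketch below assumes every output file carries the acct_ prefix and lives in the current directory; the helper name clean_accounts and the file names in the demonstration are made up, and it assumes mktemp is available.

```shell
# Hypothetical helper: delete the files left by a previous run so the
# appends start from empty files. Only files named acct_* are touched.
clean_accounts() {
    rm -f acct_*
}

# Demonstration in a scratch directory with made-up file names
workdir=`mktemp -d`
cd "$workdir" || exit 1
echo "old data" > acct_12345M
echo "unrelated" > notes.txt
clean_accounts
ls
```

Calling clean_accounts just before the while loop would give each run a clean start. Be careful with the prefix: anything else in the directory that matches acct_* is removed too.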

If you would like, contact the forum again and I can refine this for you. I'll need to know which Unix you are using and the shell that you intend to use.
 
Thanks for the reply grafman,
But I've been looking at this script and I'm not sure where to tell it what file to look in.
Thanks
 
OK, disregard my previous post, I figured that out.
BUT, I want the script to search for the account# with the M at the end. I don't want it to search just for numbers at the beginning of the line. How do I specify that?

Thanks
 
If all account numbers are 5 digits, you could use the following grep expression:

grep "^[0-9][0-9][0-9][0-9][0-9]M"

stan
 
Not all the account numbers have the same number of digits.
Then what do I do?
 
or use
Code:
grep "^[0-9]\{1,5\}M"
to match from 1 to 5 digits in the account number.
CaKiwi
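A quick way to check what the interval expression accepts is to feed grep a few test lines. The lines below are made up for the illustration; only the pattern comes from the post above.

```shell
# Made-up sample lines: account numbers of different lengths,
# plus lines that should be rejected
matches=`printf '%s\n' \
    '123M short account' \
    '12345M five digits' \
    '1234567M too many digits' \
    '42 no trailing M' \
    'plain text' |
    grep '^[0-9]\{1,5\}M'`

echo "$matches"
```

Only the first two lines pass. Because the pattern is anchored at the start of the line, an account number longer than 5 digits is rejected; a pattern such as ^[0-9][0-9]*M would accept any number of digits.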
 
Now, I have a question regarding the naming of the file.

Currently the script looks like this:
#!/bin/ksh
#start script

while read line
do
if echo $line | grep "^[0-9]\{1,5\}M" > /dev/null
then
fout=`echo $line | awk '{print $1}'`
fileout=wp${fout}.0
fi
echo $line >> $fileout
done < cmtrpt.out

I want the fileout name to be wp, followed by the account number without the M, followed by .0
How could I do that while still leaving the grep for the M?
 
I haven't tested this but try replacing the awk with
Code:
awk '{print substr($1,1,length($1)-1}'
Hope this helps. CaKiwi
 
Worked great. But right before -1 I had to insert a ) .

awk '{print substr($1,1,length($1))-1}'

THANKS to ALL for your help!!

antman
 
OOPS... once I looked into the files, I found that the -1 in the awk statement subtracted 1 from the account number used in the file name.

filename: wp19035.0
Contents: 19036M bla blah .........

??
 
Sorry about that. You needed to add the closing paren after the -1 not before. CaKiwi
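To make the difference concrete, here is a sketch that runs both placements of the closing paren on the account number from the post above. The variable names are made up for the illustration. The last line shows a shell-only alternative using the ${var%pattern} expansion, which is my suggestion and not something proposed earlier in the thread.

```shell
acct=19036M   # sample value taken from the post above

# Paren before the -1: substr returns the whole field, then awk converts
# "19036M" to the number 19036 and subtracts 1 from it
wrong=`echo "$acct" | awk '{print substr($1,1,length($1))-1}'`

# Paren after the -1: the length is reduced by one, so the trailing M is dropped
right=`echo "$acct" | awk '{print substr($1,1,length($1)-1)}'`

# Shell-only alternative: strip a trailing M with parameter expansion
alt=${acct%M}

echo "wrong: wp${wrong}.0"
echo "right: wp${right}.0"
echo "alt:   wp${alt}.0"
```

The first form produces wp19035.0, which matches the file name reported above, while the other two produce wp19036.0.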
 