Read, Check string, Depending on String write 1st or 2nd file

Status
Not open for further replies.

txwaiting4u

Programmer
Jun 28, 2002
17
US
I am a complete newbie at UNIX scripting. I have only written some really basic scripts, but now I have to figure out a solution to something our ETL tool cannot do. We use ksh. This is my need:

1. Parameters passed to script for the following:
xferin ---> the directory where the input file is to be read from

src ---> the directory where the 2 output files should be written in format MYFILE1_EXTRACT_ccyymmdd.dat and MYFILE2_EXTRACT_ccyymmdd.dat

arch ---> the directory where the input file will be zipped and archived to

extractfilename ---> the input file to the script to be read in format #####_MYFILE_EXTRACT_ccyymmdd.dat (##### is actually 30 characters long)

2. Read each line of the input file. If the rec type in column 29 (1 character) = 1 then write the line to output file 1 (MYFILE1). If rec type = 2 then write the line to output file 2. If rec type is anything other than 1 or 2, ignore record (shouldn't occur but need for exception handling).

3. The input file is in binary but the rec type is character.

4. The output files must remain in binary.

5. All files are .dat files -- (like .txt) -- with no delimiters. Fields up to the rec type are fixed length.

Probably a lot more info than anyone needs, but I'm not sure whether I have to consider some of these things in ksh scripting. I'm a mainframe expert but learning tons about UNIX these days.

Any bits or pieces of a solution would be a start. I have worked on this for the last few hours and my script is riddled with syntax errors. Also, when executing the script, how do you pass the parameters into the script? I have been trying --> ksh my_script.sh|param1|param2|...

Thank you in advance. I will now continue scouring the rest of the forum.

Sheila
 
Let's see your code so far then so we can point you in the right direction. :)

If you put a ksh shebang at the beginning of the script, e.g.

[tt]#!/bin/ksh[/tt]

...then you don't need to run it using ksh. Just make the script itself executable and it will be interpreted by ksh.

Just supply the parameters separated by spaces, then refer to them in the script as $1, $2, etc.

[tt]./my_script.ksh param1 param2 ...[/tt]

However, processing binary files could be difficult in Ksh; I tend to go for C whenever I have to manipulate binary data.
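To make the parameter passing concrete, here is a minimal sketch of how the four parameters from the original question might be picked up inside the script. The three directory values are hypothetical placeholders, the [tt]set --[/tt] line only simulates a command line so the snippet runs on its own, and the input file name is the one quoted later in this thread. Deriving the output names with parameter expansion is just one possible approach, not something the thread settles on.

```shell
# Simulates: ./my_script.ksh /tmp/xferin /tmp/src /tmp/arch <extractfilename>
# (the three directories are hypothetical; drop this line in a real script)
set -- /tmp/xferin /tmp/src /tmp/arch 00000000000000000003_ASO90AM_EXTRACT_20060516.dat

# Stop early unless exactly four parameters were supplied.
if [ $# -ne 4 ]; then
    echo "usage: my_script.ksh xferin src arch extractfilename" >&2
    exit 1
fi

xferin=$1
src=$2
arch=$3
extractfilename=$4

# Split "<prefix>_<NAME>_EXTRACT_<ccyymmdd>.dat" at its underscores.
rest=${extractfilename#*_}    # drop the numeric prefix
name=${rest%%_*}              # ASO90AM
suffix=${rest#*_}             # EXTRACT_20060516.dat

outfile1="$src/${name}1_${suffix}"
outfile2="$src/${name}2_${suffix}"

echo "reading   $xferin/$extractfilename"
echo "writing   $outfile1"
echo "writing   $outfile2"
echo "archiving to $arch"
```

The same assignments work unchanged under ksh, since only POSIX parameter expansion is used.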

Annihilannic.
 
Don't laugh because I have messed around with this for a while and finally figured I better get the basics to run first...I am probably waaaayyy off. Thank you for the help.

# This script will take the XXX.dat file and split into 2
# files based on rec type. If rec type = 1, then record
# prints to XXX1.dat file. If rec type = 2, then record
# prints to XXX2.dat file. If rec type = anything else;
# ignore.

#echo "Enter XFERIN path: " | read xferindir
#echo "Enter SRC path : " | read srcdir
#echo "Enter ARCH path : " | read archdir
#echo "Enter ExtractNm : " | read extractfilename

#cd $xferindir
xferindir="/etl/dev/ecr/scripts"
extractfilename="input.txt"

while read extractfilename
do
if(substr($extractfilename,29,1)=="1")
awk 'substr($extractfilename,29,1)=="1"` $extractfilename > "dog1.dat"
if(substr($extractfilename,29,1)=="2")
awk 'substr($extractfilename,29,1)=="2"` $extractfilename > "dog2.dat"
fi
done
 
If you are going to use awk, you may as well use it to do all of the file processing, since that's what it's for. For example, you have code such as "[tt]if(substr($extractfilename,29,1)=="1")[/tt]", which is actually awk code but sits outside any awk section of the script.

For example, something like this could replace the while/do/done loop:

[tt]awk '
substr($0,29,1)=="1" {
print > "dog1.dat"
}
substr($0,29,1)=="2" {
print > "dog2.dat"
}
' $extractfilename[/tt]

This is working on the assumption that the "records" in the file are separated by line feeds, i.e. a text file. If they are separated by other characters then things could get a little more interesting.

Annihilannic.
 
But if you remove the while/do, how does the script know to keep reading records?
 
Pretty awesome Annihilannic! I have it working!!!! Now if I can implement all these parameters, it'll be all good. Thank you for your help!
 
awk has a built-in, implicit while/read/do loop. For example, the simple script [tt]awk '$1==1 {print}' filename[/tt] means: for every record (line) in filename, if the first field is equal to 1, print it.
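The implicit loop can be seen on a small made-up file (the file name and its contents below are hypothetical). In this sketch awk evaluates the pattern once per input line and runs the action only on the lines that match, with no explicit while/read anywhere.

```shell
# Build a three-line sample file in a scratch directory.
tmp=$(mktemp -d)
printf '1 apple\n2 banana\n1 cherry\n' > "$tmp/sample.txt"

# awk reads sample.txt line by line on its own; the pattern $1 == 1
# is tested against each line and { print } runs only when it is true.
matches=$(awk '$1 == 1 { print }' "$tmp/sample.txt")

printf '%s\n' "$matches"
```

Only the "apple" and "cherry" lines are printed, because their first field is 1.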

Annihilannic.
 
I really appreciate the info. I have searched and searched forums, webpages, documentation, etc. for some insight. I figured that someone somewhere has had to do something similar. Wish I had just posted here first. Thanks so much!
 
One more question. In the substring, is that start at a zero offset or at 1?
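On the offset question: awk's substr() numbers characters from 1, so [tt]substr($0,29,1)[/tt] is the 29th character of the line. A quick check on a made-up string:

```shell
# substr(s, start, length) in awk is 1-based: position 1 is the first character.
first=$(echo "ABCDEF" | awk '{ print substr($0, 1, 1) }')
third_and_fourth=$(echo "ABCDEF" | awk '{ print substr($0, 3, 2) }')

echo "$first $third_and_fourth"
```

This prints "A CD".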
 
Then there's a problem. Is there a size limitation for the output? I don't get any output, and I've verified the location of the 1 and 2 lookup column. It takes a bit to come back so it seems to be processing. I got it to work on my small dummy files and now am implementing it on my big file which is almost 9gig.

Here's what I have now....minus comments.

#!/user/bin/ksh

extractfilename="/etl/dev/ecr/xferin/00000000000000000003_ASO90AM_EXTRACT_20060516.dat"
outfile1="/etl/dev/ecr/xferin/ASO90AM1_EXTRACT_20060516.dat"
outfile2="/etl/dev/ecr/xferin/ASO90AM2_EXTRACT_20060516.dat"
awk '
substr($0,29,1)=="1" {
print > $outfile1
}
substr($0,29,1)=="2" {
print > $outfile2
}
' $extractfilename
 
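You can't use shell variables inside an awk script like that. Because the awk script is enclosed in single quotes, the shell never expands $outfile1 and $outfile2 to the filenames; awk instead sees $outfile1 as a field reference on an unset awk variable, which evaluates to $0 (the whole line). Do this instead: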

[tt]awk -v outfile1="$outfile1" -v outfile2="$outfile2" '
substr($0,29,1)=="1" {
print > outfile1
}
substr($0,29,1)=="2" {
print > outfile2
}
' "$extractfilename"[/tt]

Also, the path to your ksh looks incorrect; it should probably be /usr, not /user.
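For anyone wanting to try the -v approach on small data first, here is a self-contained sketch. The record layout (28 filler characters followed by the record type in column 29) and all file names are invented for the test, and the input is plain text with one record per line, which may not hold for the real file.

```shell
tmp=$(mktemp -d)

# 28 zeros of filler, so the next character lands in column 29.
pad=$(printf '%028d' 0)
printf '%s1 first\n%s2 second\n%s9 other\n%s1 third\n' \
    "$pad" "$pad" "$pad" "$pad" > "$tmp/input.dat"

outfile1="$tmp/out1.dat"
outfile2="$tmp/out2.dat"

# -v copies each shell value into an awk variable of the same name.
# Lines whose record type is neither 1 nor 2 match no pattern and are dropped.
awk -v outfile1="$outfile1" -v outfile2="$outfile2" '
substr($0, 29, 1) == "1" { print > outfile1 }
substr($0, 29, 1) == "2" { print > outfile2 }
' "$tmp/input.dat"

wc -l "$outfile1" "$outfile2"
```

With this sample, out1.dat receives two lines and out2.dat one; the record with type 9 is ignored.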

Annihilannic.
 
I changed as you described but still not getting my output files when I use the big input file. Even tried a smaller subset 7gig --> 70meg. Starting to wonder if it has to do with the input file being in binary. Not viewable on UNIX so what I think is a 1 or 2 is "encrypted". I appreciate all of your assistance. Must sleep now. Will address the binary thing tomorrow. Any thoughts on the binary file and how to make my script "like it"? Thanks so much.
 
The strings command perhaps? What is binary about it... does it have lots of non-printable characters? Perhaps you can post a cat -vet of some sample records...
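As one way to see what is actually in the file, od can dump the first few bytes with non-printable characters made visible. The sample record below is invented for illustration; on the real file only the file name would change.

```shell
tmp=$(mktemp -d)

# Hypothetical record: some text, a NUL byte, more text, then a newline.
printf 'ABC\000D1\n' > "$tmp/sample.dat"

# -c shows each byte as a character or a backslash escape;
# -N 16 limits the dump to the first 16 bytes, useful on a multi-gigabyte file.
dump=$(od -c -N 16 "$tmp/sample.dat")

printf '%s\n' "$dump"
```

In the dump, the NUL byte appears as \0 and the line feed as \n, which makes it easier to tell whether records really are separated by line feeds.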

Annihilannic.
 
Is there a way to convert the one specific column from binary to check for a match but only save that non-binary data in a variable and then the output files be all binary still?
 
Without some sample data and an example of the desired output I can only guess.

Annihilannic.
 