Replacing characters in file 1

Status
Not open for further replies.

Calator

Programmer
Feb 12, 2001
262
AU
Guys, you have probably had similar questions before; nevertheless, here it comes:

I need to process a file with different record types, and manipulate strings found at certain character positions within the file, by:
- converting to upper case;
- replacing any character not in the character set with a space

The rest of the record needs to be written back to file as is.

The processing varies depending on the record type, which is in positions 1-2 of the record.
The valid character set is: 0-9 A-Z a-z / - & . * and 'space'


Example:
- for record type '02', I need to:
  - process the string in positions 34-433 by replacing non-standard characters with spaces and converting all lower case to upper case
  - process the string in positions 600-1229 against the character set
- for record type '03', I need to process the string in positions 57-229 against the character set

Can anyone supply some sample code? Thanks

 
man awk

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PHV, this is the shortest answer I ever got in the forums.
Reluctantly, I wrote the awk script, but I stopped at the point where I need an awk function that examines its string argument and replaces any character not in the character set with one space. Can anyone help with that? Thanks
 
A starting point:
function mySub(str,    x) {
    x = str
    # '/' is escaped so it cannot end the regex; '-' goes last so it is literal
    gsub(/[^ 0-9A-Za-z\/&.*-]/, " ", x)
    return x
}
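
A quick way to try the function from the shell, as a minimal sketch only. The wrapper name `clean_line` and the sample input are made up for illustration, `LC_ALL=C` is an assumption that the records are plain ASCII, and the bracket expression is written with the slash escaped and the hyphen last, which is the form most awk implementations accept.

```shell
# clean_line is a hypothetical wrapper, not part of the thread's script.
clean_line() {
    # LC_ALL=C (assumption: ASCII data) keeps A-Z and a-z as plain byte ranges
    printf '%s\n' "$1" | LC_ALL=C awk '
    function mySub(str,    x) {
        x = str
        # anything outside: space 0-9 A-Z a-z / & . * - becomes one space
        gsub(/[^ 0-9A-Za-z\/&.*-]/, " ", x)
        return x
    }
    { print mySub($0) }'
}

# The underscores and the "#" are outside the set, so each becomes a space.
clean_line 'Smith_&_Co. #12/a'
```

Each invalid character is replaced one for one, so the string length (and therefore every later character position in a fixed-width record) stays the same.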

Hope This Helps, PH.
 
Thanks.
For everyone's enjoyment (or horror), here is my effort, which seems to be working OK:

#!/usr/bin/awk -f
function tocset(mystring) {
    # Examines the "mystring" argument and replaces any character
    # that is not in the character set with a space.
    # The character set is: 0-9 A-Z a-z & / . * - and 'space'.
    # Inside the brackets only '/' needs a '\' (it would otherwise end the regex);
    # '-' is listed last because it confused awk elsewhere, even when escaped.
    gsub(/[^ 0-9A-Za-z&\/.*-]/, " ", mystring)
    return mystring
}
# manipulation of records depending on record type:

# record '01' file header needs to have size of 86 bytes
substr($0,1,2)=="01" {printf("%-85s\n",substr($0,1,86))}
#
# record '02' needs to have:
# positions 34-433 checked for valid character set and converted to CAPS
# positions 463-530 checked for valid character set
# positions 600-1229 checked for valid character set
# record size of 1229 bytes
substr($0,1,2)=="02" {
    mystring=substr($0,1,33) \
             tocset(toupper(substr($0,34,400))) \
             substr($0,434,29) \
             tocset(substr($0,463,68)) \
             substr($0,531,69) \
             tocset(substr($0,600,630));
    printf("%-1228s\n", mystring)
}
# record '03' needs to have:
# positions 57-229 checked for valid character set
# record size of 249 bytes
substr($0,1,2)=="03" {
    mystring=substr($0,1,56) \
             tocset(substr($0,57,173)) \
             substr($0,230,20);
    printf("%-248s\n", mystring)
}

# record '99' file trailer needs to have size truncated to 89
substr($0,1,2)=="99" {printf("%-88s\n",substr($0,1,89))}

# end
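
The script above works out every `substr` start and length by hand (for example, positions 34-433 become start 34, length 400). One possible way to cut down on that arithmetic is a helper that takes the first and last position directly. This is only a sketch: `fix` and `fix_demo` are hypothetical names, the demo positions (4-8) are shrunk so a short sample line fits, and unlike the script it prints every record and does no padding or truncation.

```shell
# fix_demo is a hypothetical wrapper used only to exercise the helper.
fix_demo() {
    printf '%s\n' "$1" | LC_ALL=C awk '
    function tocset(s) {
        gsub(/[^ 0-9A-Za-z&\/.*-]/, " ", s)
        return s
    }
    # Return rec with positions from..to cleaned against the character set;
    # a non-zero "upper" also converts that range to upper case.
    function fix(rec, from, to, upper,    mid) {
        mid = substr(rec, from, to - from + 1)   # length = to - from + 1
        if (upper) mid = toupper(mid)
        return substr(rec, 1, from - 1) tocset(mid) substr(rec, to + 1)
    }
    # Demo layout (assumed): for type "02", clean and upper-case positions 4-8
    substr($0, 1, 2) == "02" { $0 = fix($0, 4, 8, 1) }
    { print }'
}

fix_demo '02 ab#cd rest_untouched'
```

With the real layout, the '02' record could be built as `fix(fix(fix($0, 34, 433, 1), 463, 530, 0), 600, 1229, 0)` and then passed to the same `printf` as before.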
 