Replacing characters in file

Calator · Apr 18, 2005

Guys, you have probably had similar questions before; nevertheless, here it comes:

I need to process a file with different record types, and manipulate strings found at certain character positions within the file, by:
- converting to upper case;
- replacing any character not in the character set with a space

The rest of the record needs to be written back to file as is.

The processing varies depending on record type, which is in positions 1-2 of the record
The valid character set is: 0-9 A-Z a-z / - & . * and 'space'

Example:
- for record type '02', need to:
  - process the string in positions 34-433 by replacing non-standard characters with spaces and converting all lower case to upper case
  - process the string in positions 600-1229 against the character set
- for record type '03', need to process the string in positions 57-229 against the character set

Can anyone supply some sample code? Thanks

PHV · Apr 19, 2005

man awk

Hope This Helps, PH.

Calator · Apr 19, 2005

PHV, this is the shortest answer I ever got in the forums.
Reluctantly, I wrote the awk script, but stopped at the point where I need an awk function that examines its string argument and replaces any character not in the character set with a single space. Can anyone help with that? Thanks

PHV · Apr 20, 2005

A starting point:
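function mySub(str,    x) {
    # x is a local variable (extra parameter); str itself is left untouched
    x = str
    # replace every character outside the valid set with a space;
    # "/" is escaped and "-" is placed last so both are taken literally
    gsub(/[^ 0-9A-Za-z\/&.*-]/, " ", x)
    return x
}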
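
A quick way to try a function like this from the shell, as a sketch only: the input line below is made up, and the bracket expression is spelled with the slash escaped and the hyphen placed last, which should be the more portable form across awk implementations. Every rejected character becomes exactly one space, so the line keeps its length and the column positions do not shift.

```sh
# Hypothetical input line, chosen to include characters outside the valid set.
line='Smith & Co. #12 (a/b) x_y*z-1'

cleaned=$(printf '%s\n' "$line" | awk '
function mySub(str,    x) {      # x is a local variable
    x = str
    gsub(/[^ 0-9A-Za-z\/&.*-]/, " ", x)
    return x
}
{ print mySub($0) }')

printf '%s\n' "$cleaned"
```

Here "#", "(", ")" and "_" are blanked, while "&", ".", "/", "*" and "-" survive.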

Hope This Helps, PH.

Calator · Apr 20, 2005

Thanks,
For everyone's enjoyment (or horror), here is my effort, which seems to be working OK:

#!/usr/bin/awk -f
function tocset(mystring) {
    # Examines the "mystring" argument and replaces any character
    # that is not in the character set with a space.
    # The character set is: 0-9 A-Z a-z & / . * - and space.
    # Inside [...] only "/" needs escaping (it would otherwise end the regex);
    # "-" has to be listed last so awk does not read it as a range.
    gsub(/[^ 0-9A-Za-z&\/.*-]/, " ", mystring)
    return mystring
}
# manipulation of records depending on record type:

# record '01' file header needs to have size of 86 bytes
substr($0,1,2)=="01" {printf("%-86s\n", substr($0,1,86))}
#
# record '02' needs to have:
# positions 34-433 checked for valid character set and converted to CAPS
# positions 463-530 checked for valid character set
# positions 600-1229 checked for valid character set
# record size of 1229 bytes
substr($0,1,2)=="02" {
    mystring = substr($0,1,33) \
               tocset(toupper(substr($0,34,400))) \
               substr($0,434,29) \
               tocset(substr($0,463,68)) \
               substr($0,531,69) \
               tocset(substr($0,600,630))
    # pad short records to the full 1229 characters
    printf("%-1229s\n", mystring)
}
# record '03' needs to have:
# positions 57-229 checked for valid character set
# record size of 249 bytes
substr($0,1,2)=="03" {
    mystring = substr($0,1,56) \
               tocset(substr($0,57,173)) \
               substr($0,230,20)
    # pad short records to the full 249 characters
    printf("%-249s\n", mystring)
}

# record '99' file trailer needs to have size truncated to 89
substr($0,1,2)=="99" {printf("%-89s\n", substr($0,1,89))}

# end
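
Each length passed to substr in a script like this is last - first + 1 (for instance 433 - 34 + 1 = 400), which is easy to get wrong by hand. As a rough sketch of one way to let awk do that arithmetic: the helper name fix_range, the variable rtype and the miniature ten-character layout below are all invented for illustration and are not the real record layout.

```sh
# Two made-up records; the toy layout here is NOT the real file layout.
input='02ab#cd~ef
03gh(ij)kl'

output=$(printf '%s\n' "$input" | awk '
# Blank out characters outside the valid set (space 0-9 A-Z a-z / & . * -).
function tocset(s) {
    gsub(/[^ 0-9A-Za-z\/&.*-]/, " ", s)
    return s
}
# Rewrite inclusive positions first..last of rec; upper=1 also uppercases.
function fix_range(rec, first, last, upper,    mid) {
    mid = substr(rec, first, last - first + 1)
    if (upper) mid = toupper(mid)
    return substr(rec, 1, first - 1) tocset(mid) substr(rec, last + 1)
}
{ rtype = substr($0, 1, 2) }
rtype == "02" { $0 = fix_range($0, 3, 6, 1); $0 = fix_range($0, 7, 10, 0) }
rtype == "03" { $0 = fix_range($0, 5, 8, 0) }
{ print }')

printf '%s\n' "$output"
```

Unlike the script above, this sketch prints every record type and does no padding or truncation; those steps would still need the printf widths for each record type.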
