Help - Matching characters in a field

Status
Not open for further replies.

Guest_imported

New member
Jan 1, 1970
0
I need to remove/empty fields that contain any extended characters.

Something along the lines of
if($3 "contains any characters >0x7f")
$3 = ""
Out of my depth here I'm afraid.

I've got 80,000 records to import into a database and need to clean this one particular field - any help much appreciated.

Adi.
 
Someone may prove me wrong, but I don't think this is possible using awk. Awk is text based and so only recognizes ASCII characters. I think you will need to use C or some other general-purpose language. What other languages do you have available?
CaKiwi
 

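BEGIN {
    # Field to check (the question asks about field 3)
    fld2check = 1
    # Build the character-to-code lookup table
    initascii2dec()
}

# Return: 1 - every character is plain ASCII (0x00-0x7f)
#         0 - at least one character is above 0x7f
function validAscii(str,    i, ch)
{
    for (i = 1; i <= length(str); i++) {
        ch = substr(str, i, 1)
        # Characters missing from the table are treated as extended too
        if (!(ch in ascii2dec) || ascii2dec[ch] > 127)
            return 0
    }
    return 1
}

# Map each single character (codes 0-255) to its decimal code
function initascii2dec(    i, ch)
{
    for (i = 0; i <= 255; i++) {
        ch = sprintf("%c", i)
        ascii2dec[ch] = i
    }
}

{
    # Empty the field if it contains anything above 0x7f
    if (!validAscii($fld2check))
        $fld2check = ""

    print
}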
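
The same check can probably be done without the lookup table, by matching the field against a byte-range regular expression. What follows is only a sketch under some assumptions: fields are whitespace separated, the awk in use accepts octal escapes inside bracket expressions (gawk does), and LC_ALL=C is set so that awk compares raw bytes. The function name blank_extended is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: empty field N (default 3) when it holds any byte above 0x7f.
blank_extended() {
    # LC_ALL=C asks awk to treat the input as raw bytes, not locale-decoded characters.
    LC_ALL=C awk -v n="${1:-3}" '
        $n ~ /[\200-\377]/ { $n = "" }   # \200-\377 is octal for 0x80-0xff
        { print }
    '
}

# Usage sketch: the second record has an e-acute (bytes \303\251) in field 3.
printf 'id1 name plain\nid2 name caf\303\251\n' | blank_extended 3
```

Note that assigning to a field makes awk rebuild the record with the output separator (a single space by default), so set OFS if the import expects tabs or some other delimiter.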
 