How to detect control characters

Status
Not open for further replies.

ruxshin

Programmer
Apr 26, 2001
33
FI
Hi,

Is it possible to detect control characters, for example ^@, using an AWK script?
If so, how?

ruxshin
 
I'm not sure whether this will work in awk (it works in vi and similar tools), but try prefixing the control character you're looking for with Ctrl-V. In many programs, Ctrl-V means "take the next keystroke literally", so the control character itself gets entered. Have a go anyway.

Greg.
 
Hi,

I've tried these:

if($0 ~ /\026/)
if($0 ~ /.*\026.*/)
if($0 ~ /.*\026/)
if($0 ~ /.*\026*.*/)

All of them failed to detect the control characters. Where have I gone wrong?


ruxshin
 

Hi, ruxshin!

Maybe you are using an older version of awk that doesn't recognise escape sequences for special characters. gawk does. I use GNU awk version 3.0.3 (GNU, thanks again!) and this works:

awk '$0 ~ /\026/ { print "Here is!" }' inputfile

Bye!

KP.
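
For reference, \026 is an octal escape: octal 26 is decimal 22, which is Ctrl-V itself, whereas ^@ (NUL) would be \000. A minimal sketch of matching any control character with a range of octal escapes follows. The function name find_ctrl is made up for illustration, and the range deliberately leaves out NUL: POSIX awk only promises to handle text files, which by definition contain no NUL bytes, so matching \000 should be tested against the awk actually in use.

```shell
# find_ctrl FILE...: print "file:line" for every line that contains a
# control character (octal 001-037, which includes tab, or 177 for DEL).
# Hypothetical helper name; the newline never appears inside a record.
find_ctrl() {
    awk '/[\001-\037\177]/ { print FILENAME ":" FNR }' "$@"
}
```

Usage: find_ctrl inputfile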
 
Hi, I am sending along a utility I wrote that shows one way to check for control characters. The key is the 'ordinal' array built in the BEGIN section. (I think I based the idea on Arnold Robbins' book 'Effective AWK Programming' -- see section 15.5, Translating Between Characters and Numbers.)




Purpose:
This awk utility (visib.awk) makes invisible characters (except spaces) visible. It looks at characters in the ASCII ranges 1 to 31 and 127 to 255. This is particularly useful when a file contains hidden control characters, such as tabs and carriage returns. (Newlines are not shown, because awk strips the line terminator before the script sees each line.) It can also replace all instances of one selected character in those ranges with another character, or delete them.

There are a few syntax options to use this script:




1)
To view invisible control characters, there are 2 approaches:




1 a)
% visib.awk <filename>
This syntax will print any hidden characters as numbers enclosed in angle brackets.

Example: visib.awk myfile.txt
E.g., <13> in the output represents a carriage return (i.e., ASCII character 13).




1 b)
% visib.awk vischar=<display character(s)> <filename>
This syntax will print any hidden characters as the display character(s) that you provide.

Example: visib.awk vischar=@ myfile.txt
In this example, invisible characters will be displayed as an at sign ('@').
Note: For some display characters, such as the space (" ") and pound sign ("#"), you will need to enclose the argument in single quotes and add a backslash before the character.




2)
To replace all instances of selected characters with another character, there are 3 choices of syntax:




2 a)
% visib.awk oldchar=<old_char_num> newchar=<new_char_num> <filename>
This syntax will replace one selected control character (represented by old_char_num) with another character (represented by new_char_num).

Notes:
Both arguments are numerical values representing ascii values.

You will want to redirect the output of this command to another file, which you can then use to overwrite the original, if that is what you want.

Example: To replace a carriage return (ascii 13) with a tab (ascii 9) you could do:

visib.awk oldchar=13 newchar=9 myfile.txt > mynewfile.txt
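
For a single known character, the same replacement can be sketched without the script by using awk's built-in gsub. This is a different technique from visib.awk, shown only for comparison, and the function name cr_to_tab is illustrative.

```shell
# cr_to_tab FILE: write FILE to standard output with every carriage
# return (octal 015) replaced by a tab (octal 011).
# Illustrative helper; not part of visib.awk.
cr_to_tab() {
    awk '{ gsub(/\015/, "\011"); print }' "$1"
}
```

Usage: cr_to_tab myfile.txt > mynewfile.txt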





2 b)
% visib.awk oldchar=<old_char_num> newchar=space <filename>
This syntax is used to replace a selected control character (represented by old_char_num) with a space.

Notes:

The first argument (old_char_num) must be a numeric value representing an ascii value.

You will want to redirect the output of this command to another file, which you can then use to overwrite the original, if that is what you want.

visib.awk oldchar=13 newchar=space myfile.txt > mynewfile.txt





2 c)
% visib.awk oldchar=<old_char_num> newchar=null <filename>
This syntax is used to delete a selected control character (represented by old_char_num). Here "null" means the empty string, not an ASCII NUL byte.

Notes:

The first argument (old_char_num) must be a numeric value representing an ascii value.

You will want to redirect the output of this command to another file, which you can then use to overwrite the original, if that is what you want.

visib.awk oldchar=13 newchar=null myfile.txt > mynewfile.txt
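
The redirect-then-overwrite step mentioned in the notes can be sketched as follows. The shell truncates the output file before awk reads anything, so redirecting straight back onto the input file would destroy it; hence the temporary file. To stay self-contained, the sketch deletes carriage returns with gsub rather than calling visib.awk, and strip_cr_inplace is a hypothetical name.

```shell
# strip_cr_inplace FILE: delete every carriage return (octal 015) from FILE.
# Output goes to a temporary file first; the original is overwritten only
# if awk succeeded. Copying with cat keeps the original file permissions.
# Hypothetical helper; uses gsub instead of visib.awk.
strip_cr_inplace() {
    tmp=$(mktemp) || return 1
    if awk '{ gsub(/\015/, ""); print }' "$1" > "$tmp"; then
        cat "$tmp" > "$1" && rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage: strip_cr_inplace myfile.txt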


#!/usr/bin/awk -f

# Build a lookup table mapping each character to its numeric code.
BEGIN {
    for (i = 0; i <= 255; i++) {
        t = sprintf("%c", i)
        ordinal[t] = i
    }
}

# Rebuild each input line, translating characters outside the range 32-126.
{
    outstr = ""
    for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        charnum = ordinal[c]
        if (charnum <= 31 || charnum >= 127) {
            if (oldchar != "" && newchar != "" && charnum == oldchar) {
                # Replacement mode: swap oldchar for newchar.
                if (newchar == "space")
                    outstr = outstr " "
                else if (newchar == "null")
                    outstr = outstr ""      # i.e. drop the character
                else
                    outstr = outstr sprintf("%c", newchar + 0)
            } else if (vischar != "") {
                # Display mode with a user-supplied marker.
                outstr = outstr vischar
            } else {
                # Default display mode: show the code as <N>.
                # Non-printing characters other than oldchar also end up here.
                outstr = outstr sprintf("<%d>", charnum)
            }
        } else {
            outstr = outstr c
        }
    }
    printf("%s\n", outstr)
}
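
To try the ordinal-table idea on its own, the sketch below counts how often each control character occurs in a file. It reuses the lookup-table technique from the script above; the function name count_ctrl and the output format are illustrative assumptions. The table is byte-oriented, so the sketch sets LC_ALL=C on the assumption that the input should be treated as single-byte characters.

```shell
# count_ctrl FILE: print "<code> count" for each control character
# (codes 1-31 and 127) found in FILE, in ascending order of code.
# Hypothetical helper built on the ordinal-table technique.
count_ctrl() {
    LC_ALL=C awk '
    BEGIN {
        # Map each single character to its numeric code (1-255).
        for (i = 1; i <= 255; i++)
            ordinal[sprintf("%c", i)] = i
    }
    {
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (!(c in ordinal))
                continue            # skip anything the table lacks
            n = ordinal[c]
            if (n <= 31 || n == 127)
                count[n]++
        }
    }
    END {
        for (n = 1; n <= 127; n++)
            if (n in count)
                printf("<%d> %d\n", n, count[n])
    }' "$1"
}
```

Usage: count_ctrl myfile.txt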
 