AWK in Windows - length and substr functions with latin characters

Status
Not open for further replies.

mindwalker

Programmer
May 16, 2011
2
NL
Hi,

I'm trying to write an AWK script to run in Windows. I'm using version 3.1.6.

I'm having problems with Latin characters. Basically, when a string contains a Latin character such as "À", awk reports it with a length of 2 instead of 1.

The problem is that my file has a header in the first line, and from this header I can tell how many characters each column will occupy on every following line. When a line contains Latin characters, each one counts as 2 in length, which throws off the offsets in the substr calls I use to extract each token.
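The offset drift described above can be reproduced outside AWK. The following is a minimal Python sketch, not the poster's script. It assumes the file is UTF-8 encoded (the thread does not confirm the encoding) and that the column widths have already been derived from the header. The function name `split_fixed_width`, the sample line, and the widths `[5, 5]` are hypothetical.

```python
from itertools import accumulate


def split_fixed_width(line, widths):
    """Slice a decoded line into fields by character count, not byte count."""
    ends = list(accumulate(widths))   # cumulative end offsets, e.g. [5, 10]
    starts = [0] + ends[:-1]          # matching start offsets, e.g. [0, 5]
    return [line[s:e] for s, e in zip(starts, ends)]


# Hypothetical sample line: two columns of 5 characters each.
# "\u00c0" is the character "À"; assumption: the file is UTF-8 encoded.
raw = "\u00c0BC  12345".encode("utf-8")

# Slicing the raw bytes mimics a byte-oriented substr: the second field
# starts one byte too early because "À" occupies two bytes.
byte_fields = [raw[0:5], raw[5:10]]

# Decoding first makes every offset count characters.
line = raw.decode("utf-8")
fields = split_fixed_width(line, [5, 5])

print(byte_fields)
print(fields)
```

Slicing the bytes yields a first field that is one character short and a second field that begins with a stray space, which matches the symptom described in the post. Slicing the decoded string returns both columns intact.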

I've tried using the DOS chcp command to set code page 1252 (generic western), and I've also set the LANG, LANGUAGE and LC_ALL environment variables to C. Nothing seems to change the behaviour of length and substr in AWK with Latin characters. This is a major annoyance... if anybody can point me in the right direction, that'd be immensely awesome :)

Joel
 
I guess it's a Unicode issue.
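A quick way to see how an encoding issue could produce a length of 2 is sketched below in Python. This is an illustration under the assumption that the file is UTF-8 encoded, which the thread does not confirm. In UTF-8 the character "À" takes two bytes, while in code page 1252 it takes one, so a tool that counts bytes rather than characters would report 2 for a UTF-8 file regardless of the console code page.

```python
# "\u00c0" is the character "À" (LATIN CAPITAL LETTER A WITH GRAVE).
text = "\u00c0"

# Assumption: the data file is UTF-8; cp1252 is shown only for comparison.
utf8_bytes = text.encode("utf-8")      # b'\xc3\x80'
cp1252_bytes = text.encode("cp1252")   # b'\xc0'

char_count = len(text)                 # characters
utf8_byte_count = len(utf8_bytes)      # bytes when stored as UTF-8
cp1252_byte_count = len(cp1252_bytes)  # bytes when stored as cp1252

print(char_count, utf8_byte_count, cp1252_byte_count)  # prints: 1 2 1
```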

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Well, after failing forever to get it to work with the compiled Windows versions of gawk or nawk, I decided to install Cygwin. It works straight out of the box with no fiddling around... not the perfect solution, but it works :)
 