Unicode end of line character

richardii · Mar 16, 2001

I'm having problems with the Unicode end-of-line character. AWK doesn't seem to want to accept it as the RS; it thinks I'm dealing with one record. I can't paste the file I want to process here, because pasting converts the end-of-line characters to carriage returns, and then it looks fine! However, when I open the file in Notepad they show up as black squares, and the whole thing reads as one record.

Any help much appreciated.

grega · Mar 20, 2001

A trick I use in vi is to enter this type of character as the control sequence Ctrl-V Ctrl-M. You could try specifying that as your RS in the awk script.

Pressing Ctrl-V tells vi to expect a control character next, and Ctrl-M is that control character (carriage return).

Pressing Ctrl-V Ctrl-M should show up as ^M on the command line.

Greg.
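For readers without vi, the same character can usually be written with awk's "\r" string escape instead of typing a literal ^M. The following is only a sketch: the file name sample_cr.txt and its contents are made up for illustration, and it assumes the records end in a bare carriage return.

```shell
# Build a small sample file whose records end in a bare carriage return.
# (sample_cr.txt is a hypothetical name used only for this example.)
printf 'alpha\rbeta\rgamma\r' > sample_cr.txt

# "\r" inside an awk string is the carriage-return escape, i.e. the ^M character.
# With RS set to it, each CR-terminated chunk becomes its own record.
awk 'BEGIN { RS = "\r" } { print NR ": " $0 }' sample_cr.txt
```

If the file turned out to use a different terminator, RS would need to be set to that character instead.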

richardii · Mar 20, 2001

Thanks for that. I'm running gawk on my PC, so all this talk of vi makes me shake!!

I used this code (from the guide):

function chr(c)
{
    # force c to be numeric by adding 0
    return sprintf("%c", c + 0)
}

and specified RS=chr(13)
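Put together, the approach described above might look roughly like the sketch below. The chr() function is the one quoted in the post; the sample file name sample_cr2.txt, its contents, and the record-counting body are assumptions added only to make the example self-contained.

```shell
# Hypothetical sample data: three records, each ended by a carriage return (code 13).
printf 'one\rtwo\rthree\r' > sample_cr2.txt

awk '
function chr(c)
{
    # force c to be numeric by adding 0
    return sprintf("%c", c + 0)
}
BEGIN { RS = chr(13) }            # 13 is the decimal code for carriage return
{ count++ }                       # runs once per CR-terminated record
END { print count " records" }
' sample_cr2.txt
```

Setting RS in the BEGIN block matters here, since it has to be in place before awk reads the first record.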
