Unicode end of line character

Status
Not open for further replies.

richardii

Programmer
Jan 8, 2001
GB
I'm having problems with the Unicode end-of-line character. AWK doesn't seem to want to accept it as the RS; it thinks I'm dealing with a single record. I can't paste the file I want to process here, because pasting converts the line endings to carriage returns and then it looks fine! However, when I open the file in Notepad the line endings show up as black squares, and the whole thing reads as one record.

Any help much appreciated.
 
A trick I use in vi is to enter this type of character as the control sequence Ctrl-V Ctrl-M. You could try specifying that as your RS in the awk script.

Pressing Ctrl-V tells it to take the next keystroke literally; Ctrl-M is the control character itself (carriage return, ASCII 13).

Typing Ctrl-V Ctrl-M should show up as ^M on the command line.
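
If typing a literal control character is awkward, the same separator can usually be written as the escape sequence "\r". The following is only a sketch: it assumes a POSIX shell and an awk that processes escape sequences in -v assignments, and the helper names sample_cr and count_cr_records are made up for this illustration.

```shell
# sample_cr: hypothetical helper that emits three records, each ended by a
# bare carriage return (byte 13) rather than a newline.
sample_cr() {
    printf 'alpha\rbeta\rgamma\r'
}

# count_cr_records: hypothetical helper that counts records on stdin.
# The "\r" in the -v assignment is expanded by awk, so RS is a single CR.
count_cr_records() {
    awk -v RS='\r' 'END { print NR }'
}

# With RS set to CR, awk should see three records instead of one.
sample_cr | count_cr_records
```

Leaving RS at its default would make awk treat the same input as a single record, which matches the symptom described in the question.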

Greg.
 
Thanks for that. I'm running gawk on my PC, so all this talk of vi makes me shake!

I used this code (from the guide):

function chr(c)
{
    # force c to be numeric by adding 0
    return sprintf("%c", c + 0)
}

and specified RS = chr(13), i.e. the carriage return character.
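
For anyone landing here later, this is roughly how the pieces fit together. It is a minimal sketch, not the exact script from the thread: it assumes RS is set in a BEGIN block, that the data uses a bare carriage return between records, and that a POSIX shell is available; the wrapper name split_cr and the sample text are invented for the example.

```shell
# split_cr: hypothetical wrapper that prints each CR-terminated record on
# stdin together with its record number.
split_cr() {
    awk '
    function chr(c) {
        # force c to be numeric by adding 0
        return sprintf("%c", c + 0)
    }
    # Set RS before the first record is read; 13 is the carriage return.
    BEGIN { RS = chr(13) }
    { print NR ": " $0 }
    '
}

# Sample input invented for the example: three records separated by CR.
printf 'first line\rsecond line\rthird line\r' | split_cr
```

Setting RS in BEGIN matters because an assignment made in a normal rule only takes effect after the first record has already been read.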

 