Unicode problems/differences between TCL8.0 and 8.3 - please help

Status
Not open for further replies.

Guest_imported

Jan 1, 1970
Hi..

I'm hoping someone can help me!!

I am using the same two procs on different Unix machines, with the same two files, which contain Unicode characters (the character in question is the ° character).

However, Tcl 8.0 and Tcl 8.3 seem to treat these characters differently.

In Tcl 8.0, when the file is read in, the contents are echoed exactly as they are:
>set fh [open unicodefile]
>read $fh
°Black

When I try to do this in Tcl 8.3, however, the echoed contents include an extra character (Â):
>set fh [open unicodefile]
>read $fh
Â°Black

Does anyone know why this is occurring? I'm running both on Sun, and have checked that all the locale environment variables are the same (I have tried setting LANG to both en_US and en_UK).

I have also tried to fconfigure the encoding in 8.3 before reading the file, like this:
>set fh [open unicodefile]
>fconfigure $fh -encoding iso8859-1
>read $fh
°Black

To be sure that this actually did something, I also tried to fconfigure the encoding to each of iso8859-1 through iso8859-9, and iso8859-5 gave me a different result (so at least I know that it is doing something!!)

If anyone can please help, or at least point me in a direction to look.. I've tried every search engine I can find, but haven't been able to find any answers... I really need this to work on Tcl 8.3 because we are upgrading our systems to Tcl 8.3...

Thanks a lot
Isa
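
One possible mechanism for the extra Â, offered as a sketch and not as a diagnosis of the files above: if the degree sign (U+00B0) is encoded as UTF-8 at some point, it becomes two bytes, and anything that then treats those bytes as ISO 8859-1 shows two characters. The snippet assumes Tcl 8.1 or later (the encoding command does not exist in 8.0), and the variable names are only illustrative.

```tcl
# Assumption: the character in question is U+00B0 (degree sign).
set degree "\u00b0"

# Encode it as UTF-8; the result is a two-byte sequence.
set utf8bytes [encoding convertto utf-8 $degree]
binary scan $utf8bytes H* hex
puts "UTF-8 bytes: $hex"

# Interpret the same bytes as ISO 8859-1: every byte becomes one character.
set misread [encoding convertfrom iso8859-1 $utf8bytes]
puts "Characters after misreading: [string length $misread]"

# Print the code point of each resulting character.
set codes {}
foreach c [split $misread {}] {
    scan $c %c code
    lappend codes [format "U+%04X" $code]
}
puts $codes
```

The two code points printed are U+00C2 (Â) and U+00B0 (°), which matches the Â° seen in 8.3. Whether the mismatch here happens while reading the file or while writing to the terminal is not something this sketch can tell.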
 
I tried running the following script which
writes files of ascending, descending and
random character codes, and then reads them
back in to check. Results:

8.3.3 - no errors
8.0p2 -
8.0.4 - many errors, which on further
investigation are due to the fact
that character 0 (NULL) is not
written.

So it looks like 8.3.3 handles binary data
well.

Looking at your posting, are you sure that
the file I/O is different, and not just how
the non-ASCII codes are displayed in
the console? You may get a clearer picture
by using [scan <string> {%c} ...] to get the
character codes of your binary data.
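
A minimal sketch of that check, assuming a hypothetical helper named byteCodes and a made-up sample file (sample_degree.txt) standing in for the real one: it reads the file with binary translation, so no encoding conversion is applied, and returns the numeric code of every byte.

```tcl
# Hypothetical helper: return the numeric code of every byte in a file.
proc byteCodes {path} {
    set fh [open $path r]
    # Binary translation turns off encoding and end-of-line conversion.
    fconfigure $fh -translation binary
    set data [read $fh]
    close $fh

    set codes {}
    foreach c [split $data {}] {
        scan $c %c code
        lappend codes $code
    }
    return $codes
}

# Demo: a sample file holding the single byte 0xB0 followed by "Black".
set fh [open sample_degree.txt w]
fconfigure $fh -translation binary
puts -nonewline $fh [binary format H* b0426c61636b]
close $fh

puts [byteCodes sample_degree.txt]
```

For the sample file this prints 176 66 108 97 99 107. Run against the real file, a list starting with 176 would mean the degree sign is stored as one ISO 8859-1 byte, while 194 176 would mean it is stored as UTF-8. If both Tcl versions report the same codes, the difference is more likely in how the console output is encoded than in the file read.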

Script:

#!/bin/sh
# Following line is executable by sh but not by tclsh ... \
exec tclsh "$0" "$@"

puts "Testing binary I/O on $tcl_patchLevel"

# Test 1: write character codes 0..255 in ascending order.
set fp [open ascii w]
fconfigure $fp -translation binary
for {set i 0} {$i < 256} {incr i} {
    puts -nonewline $fp [format {%c} $i]
}
close $fp

# Read the file back without any translation.
set fp [open ascii r]
fconfigure $fp -translation binary
set s [read $fp]
close $fp

# Compare each character code with the expected value.
set i 0
foreach c [split $s {}] {
    if {[scan $c {%c} j] != 1} {
        error "Unable to scan character!"
    }
    if {$i != $j} {
        puts "$i BAD - j = $j"
    }
    incr i
}

puts "done ascending ($i chars)"

# Test 2: write character codes 255..0 in descending order.
set fp [open ascii w]
# Binary translation on the write side too, to match the ascending test.
fconfigure $fp -translation binary
for {set i 255} {$i >= 0} {incr i -1} {
    puts -nonewline $fp [format {%c} $i]
}
close $fp

set fp [open ascii r]
fconfigure $fp -translation binary
set s [read $fp]
close $fp

# Compare each character code with the expected value.
set i 255
foreach c [split $s {}] {
    if {[scan $c {%c} j] != 1} {
        error "Unable to scan character!"
    }
    if {$i != $j} {
        puts "$i BAD"
    }
    incr i -1
}

puts "done descending ([expr {255 - $i}] chars)"

# Test 3: write 5000 pseudo-random character codes from a fixed seed.
set fp [open random_ascii w]
# Binary translation on the write side too, to match the ascending test.
fconfigure $fp -translation binary
expr {srand(88888)}
for {set i 0} {$i < 5000} {incr i} {
    puts -nonewline $fp [format {%c} [expr {int(floor(rand() * 256))}]]
}
close $fp

set fp [open random_ascii r]
fconfigure $fp -translation binary
set s [read $fp]
close $fp

# Re-seed so that rand() replays the same sequence for comparison.
set i 0
expr {srand(88888)}
foreach c [split $s {}] {
    if {[scan $c {%c} j] != 1} {
        error "Unable to scan character!"
    }
    if {int(floor(rand() * 256)) != $j} {
        puts "$i BAD"
    }
    incr i
}

puts "done random ($i chars)"
 