pkskytektip
Programmer
I have some HTML files that use at least three different languages and are generated by Perl from both a MySQL database and original HTML files.
This vast collection of HTML files has to be edited in order to transform some characters that are not valid UTF-8.
I have made a little progress in applying the Encode module from Perl. If I apply
Code:
$_ = encode( "UTF8", $_);
to a file handle representing a test file, I get all of my characters transformed to correct UTF-8 except the ISO-8859-2 (Latin-2) characters, which are mangled. I get "??or??evi??" instead of "Đorđević".
If I look at the original sample file in Notepad++ and check the encoding set in the Encoding menu, it shows up as ANSI.
I don't see ANSI as a supported encoding in the Encode module. Is there another way to do this?
Any and all tips or clues would be appreciated.
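For illustration, here is a minimal sketch of one approach that might apply. It decodes the raw bytes from their real source encoding first, then encodes the result as UTF-8. "ANSI" in Notepad++ means the system's default Windows code page rather than one specific encoding, so the choice of cp1250 below is an assumption, and to_utf8 is a hypothetical helper name, not part of Encode. Calling encode() directly on undecoded bytes treats each byte as a Latin-1 character, which may explain why only the Latin-2 characters come out wrong.

Code:
```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the legacy bytes are Windows-1250 ("cp1250").
# Swap in "iso-8859-2" or "cp1252" if the files really use one of those.
my $source_encoding = 'cp1250';

# Hypothetical helper: convert raw bytes in a known legacy encoding
# into UTF-8 encoded bytes.
sub to_utf8 {
    my ($bytes, $from) = @_;
    my $text = decode($from, $bytes);   # legacy bytes -> Perl characters
    return encode('UTF-8', $text);      # Perl characters -> UTF-8 bytes
}

# "Đorđević" written as cp1250 bytes (the same bytes in ISO-8859-2).
my $legacy = "\xD0or\xF0evi\xE6";
my $utf8   = to_utf8($legacy, $source_encoding);
```

To apply this per line, the input file handle would need to be read as raw bytes (no I/O layer) and the output written without a further encoding layer, so the text is not encoded twice. Encode->encodings(":all") lists the encoding names available in a given installation.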