use Encode: I need to change the encoding of a file to UTF-8

Status
Not open for further replies.

pkskytektip

Programmer
Apr 3, 2010
21
US
I have some HTML files that use at least three different languages and are generated by Perl from both a MySQL database and original HTML files.

This large collection of HTML files has to be edited to convert some characters that are not valid UTF-8.

I have made a little progress in applying the Encode module from Perl. If I apply
Code:
$_ = encode( "UTF8", $_);
to the lines read from a file handle on a test file, all of my characters are converted to correct UTF-8 except the ISO-8859-2 (Latin-2) characters, which are mangled. I get "??or??evi??" instead of "?or?evi?".

If I open the original sample file in Notepad++ and check the Encoding menu, the encoding is shown as ANSI.

I don't see ANSI listed as a supported encoding in the Encode module. Is there another way to do this?

Any and all tips or clues would be appreciated.
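
One possible direction, offered as a sketch rather than a confirmed fix: Encode generally needs the bytes decoded from their actual source encoding before they are encoded as UTF-8. Notepad++'s "ANSI" label normally refers to the system's Windows code page rather than one named encoding, so the source encoding used below (iso-8859-2) is an assumption; cp1250 may be the better guess for files produced on Windows. The helper name to_utf8 and the sample string are hypothetical and only there for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert raw bytes from an assumed source
# encoding into UTF-8 encoded bytes.
sub to_utf8 {
    my ($bytes, $source_encoding) = @_;

    # decode() turns raw bytes into Perl characters.
    # FB_CROAK makes it die on input that is invalid for that encoding.
    my $chars = decode($source_encoding, $bytes, Encode::FB_CROAK);

    # encode() turns Perl characters back into bytes, here strict UTF-8.
    return encode('UTF-8', $chars);
}

# Illustrative sample (an assumption, not taken from the real files):
# a Latin-2 name with 0xD0, 0xF0 and 0xE6 as single-byte characters.
my $latin2_bytes = "\xD0or\xF0evi\xE6";
my $utf8_bytes   = to_utf8($latin2_bytes, 'iso-8859-2');

# Show the resulting bytes in hex so the conversion can be inspected.
print join(' ', map { sprintf '%02X', ord } split //, $utf8_bytes), "\n";
```

For whole files, the same idea can be applied with PerlIO layers, for example opening the input with '<:encoding(iso-8859-2)' and the output with '>:encoding(UTF-8)'. If some of the files are already UTF-8, decoding them as Latin-2 would double-encode them, so it may be worth trying a strict UTF-8 decode on each file first and only falling back to the assumed legacy encoding when that fails.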