file name

Status
Not open for further replies.

imad77

Instructor
Oct 18, 2008
97
CA
Hi,

I have an issue when I upload a file via a Perl/CGI script: the file name gets changed. Here is an example.

The original file, on a Windows system, is named: région.txt
When I upload it from a Web page to a Linux server, the name is changed to:
r?gion.txt

Is there a way to keep the special character "é" in the name?

Thanks

Imad
 
I saw something similar to this a while ago: I was using the wrong character set on an Oracle database. You create a database with a particular character set, and changing it afterwards is a pain.

I'd imagine that the Linux system is doing something similar - changing the e-acute character to something else. You could work around this by storing the original file names in a data file or database table and then looking them up when you need them.
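A minimal sketch of that lookup-table workaround, assuming a plain tab-separated index file is good enough. The helper names (safe_name, remember_name, original_name) and the index file format are made up for illustration; they are not from the original poster's script.

```perl
use strict;
use warnings;

# Hypothetical helper: derive an ASCII-only name that is safe to use on disk.
# Every byte outside A-Z, a-z, 0-9, dot, underscore and hyphen becomes "_".
sub safe_name {
    my ($original) = @_;
    (my $safe = $original) =~ s/[^A-Za-z0-9._-]/_/g;
    return $safe;
}

# Hypothetical helper: append "safe<TAB>original" to the index file and
# return the safe name, which is what the upload would be saved as.
sub remember_name {
    my ($index_file, $original) = @_;
    my $safe = safe_name($original);
    open my $fh, '>>', $index_file or die "Cannot open $index_file: $!";
    print {$fh} "$safe\t$original\n";
    close $fh or die "Cannot close $index_file: $!";
    return $safe;
}

# Hypothetical helper: find the original name for a stored (safe) name.
# Returns undef if the safe name is not in the index.
sub original_name {
    my ($index_file, $safe) = @_;
    open my $fh, '<', $index_file or die "Cannot open $index_file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($stored, $original) = split /\t/, $line, 2;
        return $original if $stored eq $safe;
    }
    return;
}
```

Two different originals can map to the same safe name (for example "région.txt" and "r_gion.txt"), so a real script would also need some way to make stored names unique, and to reject original names containing tabs or newlines.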

(I ended up translating the # character to the £ character whenever I read it from the DB)

Mike

 
Test what the filename is in the CGI script immediately when the file is being uploaded to the server (i.e. get param("filefield") and see what its value is, as submitted directly from the web form). This will show whether the ? was actually sent to the server by your web browser or not. If the server sees the accented e at that point, then you can rule out any client issue and comb through the rest of the code to narrow it down somewhere else.

If everything is good and the filename from the form isn't tampered with anywhere else in the script before it's used in the open() command to write the file, then that would mean it's some kind of configuration on the server (check things such as: what filesystem is in use, whether there are any kernel-level security policies in place, etc.)

Also test the name again immediately before it's used in an open() command, to further verify that your script isn't doing the conversion itself before even saving it to disk.
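One way to do both checks is to log the name as hex bytes, so a literal "?" (0x3f) can't be confused with an accented byte that your terminal merely fails to display. This is only a sketch: hex_bytes is a made-up helper, and the param("filefield") call and $target_name in the comments are placeholders for whatever your script actually uses.

```perl
use strict;
use warnings;

# Hypothetical helper: render every character of a name as a hex value.
sub hex_bytes {
    my ($name) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $name;
}

# In the CGI script this would be called at both checkpoints, e.g.:
#   warn "from form:   ", hex_bytes(scalar param("filefield")), "\n";
#   warn "before open: ", hex_bytes($target_name), "\n";

print hex_bytes("r\xe9gion.txt"), "\n";   # 72 e9 67 69 6f 6e 2e 74 78 74
print hex_bytes("r?gion.txt"), "\n";      # 72 3f 67 69 6f 6e 2e 74 78 74
```

If the first log line already shows 3f, the browser or form encoding is the place to look; if it shows e9 (or c3 a9) and the second one shows 3f, the script itself is changing the name.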

Cuvou.com | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
It sounds to me like a simple character set issue.
I suspect that your "ftp" or "ssh" or "telnet" session is simply not configured to use the same character set as your web browser.
The fact that we are seeing only one question mark makes me think we have only one byte for that character. That would probably rule out a multi-byte Unicode encoding such as UTF-8, which uses two bytes for the accented "e".
Remember that Windoze uses its own character set (Windows-1252) and that Linux systems are unlikely to be set up to use that. They are much more likely to be using ISO 8859-1.
That being the case, you'll probably find that the byte value that represents the accented "e" (0xE9 in both character sets) is still the same on the linux machine as it was in windoze.
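A quick way to check those byte values yourself, using the core Encode module. The encoded_hex helper is just a name made up for this sketch.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode one character in the given character set
# and return the resulting bytes as hex.
sub encoded_hex {
    my ($charset, $char) = @_;
    my $bytes = encode($charset, $char);
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

my $e_acute = "\x{e9}";   # U+00E9, LATIN SMALL LETTER E WITH ACUTE

for my $charset ('cp1252', 'iso-8859-1', 'UTF-8') {
    printf "%-10s %s\n", $charset, encoded_hex($charset, $e_acute);
}
# cp1252     e9
# iso-8859-1 e9
# UTF-8      c3 a9
```

A single byte e9 in the file name on disk would fit the one-question-mark symptom; the two-byte UTF-8 form would more typically show up as two odd characters.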

HTH.


Trojan.
 