how to determine whether a file is ascii or binary

Status
Not open for further replies.

tonykent

IS-IT--Management
Jun 13, 2002
251
GB
Happy New Year to you all! For my sins I'm a configuration manager, and I'm currently working on getting a large application that's spread over a number of DEC servers into my configuration database. I've written a number of Perl scripts to sort the thousands of files and now have them neatly arranged ready to be input, but I've realised I have another problem.

My software imports files as binary or ascii depending on their extension, but many of these files have no extension in their name, so I'm going to have to sort them into binary or ascii prior to importing. So to the problem: can anyone advise me how to detect whether a file is ascii or binary using Perl?
 
Well, every file is binary. The only difference when copying files between systems is that an "ascii" file needs to have its newlines converted to the convention used by the new system (\n, \r\n, \r, etc.). Given a file of unknown content, I suppose you could test for the existence of certain characters that give away whether it is ascii or not (such as those not in the ascii character set).
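The character test suggested above could be sketched roughly as follows. This is only a heuristic under stated assumptions: the sub name looks_binary, the 4096-byte sample size, and the exact set of bytes counted as "text" (printable 7-bit ASCII plus tab, LF, FF and CR) are illustrative choices, not anything from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Heuristic (assumption): a file is "binary" if its first 4096 bytes contain
# any byte that is not printable 7-bit ASCII, tab, LF, FF or CR.
sub looks_binary {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $buf  = '';
    my $read = read($fh, $buf, 4096);
    close($fh);
    die "Cannot read $path: $!" unless defined $read;
    return 0 if $read == 0;    # empty file: treat as ascii
    return $buf =~ /[^\x09\x0A\x0C\x0D\x20-\x7E]/ ? 1 : 0;
}

# Report a guess for every file named on the command line.
for my $path (@ARGV) {
    printf "%s: %s\n", $path, looks_binary($path) ? 'binary' : 'ascii';
}
```

Perl also has built-in -T and -B file tests that make a similar guess from the first block of a file, so `-T $path` may be enough if a rough answer will do.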

Or else just do the conversion yourself. Unix machines usually use \n to signify a newline; Windows machines use \r\n. So, if you were converting from Windows to Unix, just read the file into a string and do a substitution, e.g.:
Code:
$string =~ s/\r\n/\n/g;
This assumes that non-ascii files will not contain the combination \r\n.
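For a whole file, the conversion might look something like the sketch below. The sub names crlf_to_lf and convert_file are made up for illustration, and the file is rewritten in place, so it is only safe on files already known to be text; a binary file that happens to contain \r\n would be corrupted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert Windows line endings (CRLF) to Unix line endings (LF) in a string.
sub crlf_to_lf {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Rewrite one file in place. Raw mode stops Perl from translating
# newlines itself while reading or writing.
sub convert_file {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "Cannot read $path: $!";
    my $content = do { local $/; <$in> };    # slurp the whole file
    close($in);
    $content = '' unless defined $content;
    open(my $out, '>:raw', $path) or die "Cannot write $path: $!";
    print {$out} crlf_to_lf($content);
    close($out) or die "Cannot close $path: $!";
}

# Convert every file named on the command line.
convert_file($_) for @ARGV;
```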
Sincerely,

Tom Anderson
CEO, Order amid Chaos, Inc.
 
On a Unix system: /bin/file filename
will tell you.
-----------
When they don't ask you anymore where they come from, and they don't tell you anymore where they go ... you're getting older!
 
Thanks for the ideas. The way I'm going to do it is to use:

file -f textfile > filetypes.txt

Where the textfile contains a list of all the files I'm interested in. This does indeed output a list of the files and their types. I can use this as input to a script which copies the files to different locations depending upon whether they are ascii or binary. I can then import them into my configuration database in 2 steps, one with ascii as default and one with binary as default.
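A script along those lines might look like the sketch below. It assumes each line of filetypes.txt has the usual "name: description" shape, that no file name contains a colon followed by whitespace, and that file(1) puts the word "text" in the description of anything safe to treat as ascii. The sub name classify_line and the destination layout (sorted/ascii and sorted/binary) are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy     qw(copy);
use File::Basename qw(basename);
use File::Path     qw(make_path);

# Classify one line of file(1) output, e.g. "src/build: ASCII text".
# Returns (path, 'ascii' or 'binary'), or an empty list if the line
# does not parse.
sub classify_line {
    my ($line) = @_;
    chomp $line;
    my ($path, $desc) = $line =~ /^(.+?):\s+(.*)$/ or return;
    return ($path, $desc =~ /\btext\b/ ? 'ascii' : 'binary');
}

# Usage (hypothetical): perl sortfiles.pl filetypes.txt [destdir]
my ($list, $dest) = @ARGV;
if (defined $list) {
    $dest = 'sorted' unless defined $dest;
    make_path("$dest/ascii", "$dest/binary");
    open(my $fh, '<', $list) or die "Cannot open $list: $!";
    while (my $line = <$fh>) {
        my ($path, $type) = classify_line($line) or next;
        # Note: files sharing a base name will overwrite each other here.
        copy($path, "$dest/$type/" . basename($path))
            or warn "Copy failed for $path: $!";
    }
    close($fh);
}
```

Copying rather than moving leaves the original tree intact, so the classification can be spot-checked before anything is imported.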
 