
Determining if a file is ASCII

Status
Not open for further replies.

riderman34

Programmer
May 14, 2001
1
US
Can someone please help? I am writing a program that reads a file and then moves that information to various tables in our application. Before I do anything with the file, I want to determine whether it is in ASCII format. If it is, I want to process it; if it is not, I do not. Is there a C function or a Windows call that will help me determine what kind of file is being used? I have already ruled out looking at the file's extension, because ASCII files can have many different extensions. Can anyone help? Thanks.
 
Use isascii() on every char in the file. It will take a while, but it will solve the problem.
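
A minimal sketch of that per-character check, under a few assumptions: isascii() is not part of ISO C, so the code compares each byte against 0x7F directly, which is the same test. The function name file_is_7bit_ascii is made up for illustration. Note that this only proves the file is 7-bit; control characters such as NUL still pass.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (name is illustrative, not a library call).
 * Returns  1 if every byte is in the 7-bit range 0x00-0x7F,
 *          0 if any byte has the high bit set,
 *         -1 if the file cannot be opened or read. */
int file_is_7bit_ascii(const char *path)
{
    FILE *fp;
    int c;
    int result = 1;

    assert(path != NULL);

    /* "rb": binary mode, so the C runtime does no newline translation. */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* getc() yields each byte as 0..255, or EOF at end of file or on error. */
    while ((c = getc(fp)) != EOF) {
        if (c > 0x7F) {      /* same condition isascii() would reject */
            result = 0;
            break;
        }
    }

    if (ferror(fp))
        result = -1;

    fclose(fp);
    return result;
}
```

Call it as file_is_7bit_ascii("input.dat") and process the file only when the result is 1. An empty file also returns 1, so check the size separately if that matters.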
 
To check whether a file is in ASCII format, you could always open it directly in a text editor. Some editors will warn you, or show unreadable characters, if the file is binary.

Hoping to get certified..in C programming.
 
Hi,
I think you can use the fopen() function, which is used for opening text files. If the file is not in ASCII format, it will not open the file and will return an error status... I guess so.

regards,
Mahesh
 
If you are trying to determine the file contents under Linux/Unix, try the 'file' command. It reports the contents of a file as 'ascii text', 'ELF executable', 'data', etc. For 'ascii text' it is about 95% accurate: if a file is mostly ASCII characters but contains some null characters, it will report 'ascii text' when it should report 'data'. The only non-printable characters that should appear in a pure ASCII text file are line-feed (0x0a), carriage-return (0x0d) and, if the file was generated for a printer, form-feed (0x0c). Tab (0x09) can also appear in a file that a text editor will accept, but unless the program reading the file can expand tabs to spaces, the data you are trying to pull from the file may not align properly.

If you want to emulate the 'file' command under Windows, write a small C program that reads the first 4096 characters or so and examines the data. If you find any non-printable characters other than line-feed (0x0a) or carriage-return (0x0d), I would reject the file as a candidate for 'pure ASCII' status, especially since you intend to use the file as data input to another system, which may require the field data to be properly aligned.

BTW, how much data you read in the C program is up to you. It needs to be enough to be a statistically significant representation of the target data.
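
The sampling check described above could be sketched roughly as follows. This is an illustration only: the names SAMPLE_SIZE, is_text_byte and looks_like_ascii_text are hypothetical, the 4096-byte sample is just the figure mentioned above, and the allow_tab_ff switch is an assumed convenience for callers who want to accept tab and form-feed as well.

```c
#include <assert.h>
#include <stdio.h>

#define SAMPLE_SIZE 4096   /* how much to inspect; tune to your data */

/* Returns 1 if the byte is acceptable in a plain ASCII text file.
 * Printable ASCII is 0x20-0x7E; LF and CR are always allowed;
 * tab and form-feed are allowed only when allow_tab_ff is nonzero. */
static int is_text_byte(unsigned char b, int allow_tab_ff)
{
    if (b >= 0x20 && b <= 0x7E)
        return 1;
    if (b == 0x0A || b == 0x0D)
        return 1;
    if (allow_tab_ff && (b == 0x09 || b == 0x0C))
        return 1;
    return 0;
}

/* Inspects up to SAMPLE_SIZE bytes from the start of the file.
 * Returns 1 if all sampled bytes pass, 0 if any byte fails,
 * -1 if the file cannot be opened or read. */
int looks_like_ascii_text(const char *path, int allow_tab_ff)
{
    unsigned char buf[SAMPLE_SIZE];
    size_t n, i;
    int read_error;
    FILE *fp;

    assert(path != NULL);

    fp = fopen(path, "rb");   /* binary mode: see the raw bytes */
    if (fp == NULL)
        return -1;

    n = fread(buf, 1, sizeof buf, fp);
    read_error = ferror(fp);
    fclose(fp);
    if (read_error)
        return -1;

    for (i = 0; i < n; i++) {
        if (!is_text_byte(buf[i], allow_tab_ff))
            return 0;
    }
    return 1;
}
```

For fixed-column input you would call looks_like_ascii_text(path, 0) so that tabs are rejected. Bytes beyond the sample are never examined, so a larger SAMPLE_SIZE, or a loop over the whole file, gives a stronger guarantee.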
 