Most common symbol

Status
Not open for further replies.

AlexeiD

Programmer
Nov 24, 2003
4
RU
Let's say I have a text.
I need to know which symbol in that text is the most common (appears most often).
I'm rather new to Pascal, so I don't know how to do that.
I understand it's probably done with an array.
Thanks for your help :)
 
If this text is stored in a file, you can read it either line by line or character by character (if you read lines, you have to loop through each line to get at its characters). You also need an array with 256 entries of, say, word or longint (which one depends on the size of the text: a word counter overflows above 65535). Each entry in the array is a counter for one character (e.g. entry 65 is the counter for the character 'A').
For each character you read, you increase the matching counter. If you want to count letters rather than characters, convert every character you read to uppercase so that 'A' and 'a' are counted together.
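
The letters-only variant described above might look roughly like the sketch below. It is only an illustration: the sample text is made up, and the counter table is narrowed to 'A'..'Z' because everything else is skipped.

```pascal
program LetterCount;
{ Sketch: count letters only, treating 'a' and 'A' as the same letter. }
var
  counters : array['A'..'Z'] of longint;
  s        : string;
  i        : integer;
  ch, best : char;
begin
  s := 'Banana bread';                  { made-up sample text }
  fillchar(counters, sizeof(counters), 0);
  for i := 1 to length(s) do
  begin
    ch := upcase(s[i]);                 { 'a'..'z' become 'A'..'Z' }
    if (ch >= 'A') and (ch <= 'Z') then
      inc(counters[ch]);                { anything that is not a letter is skipped }
  end;
  best := 'A';
  for ch := 'B' to 'Z' do               { find the letter with the largest counter }
    if counters[ch] > counters[best] then
      best := ch;
  writeln(best, ' ', counters[best]);   { prints A 4 }
end.
```

If two letters are tied, this sketch reports the one that comes first in the alphabet.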

A general outline of the program would be as follows:
Code:
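VAR counters : array[#0..#255] of longint;  { one counter per character }
    ch : char;

fillchar(counters, sizeof(counters), 0);    { start every counter at zero }
while not eof(f) do                         { also safe for an empty file }
begin
  ch:=getcharacter(f);                      { placeholder: next character }
  inc(counters[ch]);
end;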

I used the getcharacter function here to indicate that the next character should be provided; it can come either directly from the file or from a string (when the file is read line by line). Note, however, that reading the file line by line makes the code a little more complex, but should in theory be faster.
The reason I declared the counters array with character indices instead of integer indices is that it is easier to address this way. Had I used integers (array[0..255]), the count loop would need a conversion from character to byte, e.g. ord(ch). Note that using characters as array indices is perfectly valid Turbo Pascal code.
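
To make the difference concrete, here is a small sketch that declares the table both ways. The names byChar and byCode are invented for this illustration.

```pascal
program IndexDemo;
{ Shows the same counter table declared two ways. }
var
  byChar : array[#0..#255] of longint;  { indexed directly by a character }
  byCode : array[0..255] of longint;    { indexed by the character's code }
  ch     : char;
begin
  fillchar(byChar, sizeof(byChar), 0);
  fillchar(byCode, sizeof(byCode), 0);
  ch := 'A';
  inc(byChar[ch]);                      { no conversion needed }
  inc(byCode[ord(ch)]);                 { ord gives the character's code, 65 for 'A' }
  writeln(byChar['A'], ' ', byCode[65]);  { prints 1 1 }
end.
```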
Another remark: if you read the file line by line, you will never encounter a CRLF (carriage return/line feed), because the line ends are not part of the strings you read; when reading character by character, these characters are counted as well.
And finally, you can read faq935-3446 to check out how to use text files.
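
Putting the outline together, a complete character-by-character version could look something like the sketch below. It is a minimal sketch, not the only way to do it: 'input.txt' is a placeholder file name that is assumed to exist, plain read(f, ch) stands in for getcharacter, and the final loop that picks the largest counter is an addition that the outline leaves to the reader.

```pascal
program MostCommon;
{ Sketch: count every character in a text file, then report the most frequent. }
{ 'input.txt' is a placeholder file name and is assumed to exist. }
var
  counters : array[#0..#255] of longint;
  f        : text;
  ch, best : char;
begin
  fillchar(counters, sizeof(counters), 0);  { all counters start at zero }
  assign(f, 'input.txt');
  reset(f);
  while not eof(f) do
  begin
    read(f, ch);                            { next character, line ends included }
    inc(counters[ch]);
  end;
  close(f);

  best := #0;
  for ch := #0 to #255 do                   { scan the table for the largest counter }
    if counters[ch] > counters[best] then
      best := ch;

  if counters[best] = 0 then
    writeln('The file is empty.')
  else
    writeln('Most common: "', best, '" (code ', ord(best), '), ',
            counters[best], ' times');
end.
```

On ordinary prose the winner is often the space character. When two characters are tied, the one with the lower code is reported.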

Regards,
Bert Vingerhoets
vingerhoetsbert@hotmail.com
Don't worry what people think about you. They're too busy wondering what you think about them.
 
Thanks for the really fast and useful answer :)
 
I agree with Bert's answer, except about reading line by line. If you read it as a text file you are reading strings, which cannot exceed 255 characters, so you will run into difficulties if your text file contains longer lines, for example because it was generated by something without that limitation. For safety it might be better to treat it as a binary file.
This will not necessarily make things slower. Binary files are read through a buffer, and even if you were programming for a system with no buffer, you could still use blockread to read large chunks at a time.
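
As a rough illustration of the blockread approach, the sketch below opens the file as an untyped file and counts raw bytes. The file name 'input.txt' (assumed to exist) and the 4096-byte buffer size are arbitrary choices made for this example.

```pascal
program MostCommonBlock;
{ Sketch: same count, but the file is read as raw bytes in large blocks. }
{ 'input.txt' and the 4096-byte buffer size are arbitrary choices. }
var
  counters : array[byte] of longint;
  buf      : array[1..4096] of byte;
  f        : file;                          { untyped file }
  got, i   : word;
  b, best  : byte;
begin
  fillchar(counters, sizeof(counters), 0);
  assign(f, 'input.txt');
  reset(f, 1);                              { record size 1, so counts are in bytes }
  repeat
    blockread(f, buf, sizeof(buf), got);    { got = number of bytes actually read }
    for i := 1 to got do
      inc(counters[buf[i]]);                { line length does not matter here }
  until got = 0;
  close(f);

  best := 0;
  for b := 1 to 255 do                      { find the byte value seen most often }
    if counters[b] > counters[best] then
      best := b;
  writeln('Most common byte: ', best, ' (', counters[best], ' times)');
end.
```

Because the file is never split into strings, the 255-character limit does not come into play, and carriage returns and line feeds are counted like any other byte.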
Good luck!
 