Compressing array with duplicate values

hads · Aug 8, 2002

Hi all,

I am working on a basic indexing engine for a database driven article/news type site.

I have the search bit sorted, and am working on the indexing. The article always comes from a database in plain text.

What I figured is this:

1. Strip all the punctuation and bad words from the text
2. Get all the words into an array
3. Somehow 'compress' the duplicate words in the array to one entry + the number of times it occurred.
4. Store this data in a table.
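
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under assumptions: the `tokenize()` helper and the `$stopWords` list are hypothetical (the post does not say which words count as "bad"), and the pattern handles plain ASCII text only.

```php
<?php
// Hypothetical stop-word list; the original post does not say which words are "bad".
$stopWords = array('the', 'a', 'an', 'and', 'of', 'to', 'is');

// Hypothetical helper: turns plain text into a list of lower-case words.
function tokenize($text, $stopWords)
{
    // Lower-case so "The" and "the" count as the same word.
    $text = strtolower($text);
    // Replace everything except letters, digits and whitespace with a space.
    $text = preg_replace('/[^a-z0-9\s]+/', ' ', $text);
    // Split on runs of whitespace, dropping empty strings.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Remove stop words; array_values() renumbers the keys from 0.
    return array_values(array_diff($words, $stopWords));
}

$words = tokenize('The cat and the hat. The CAT sat!', $stopWords);
// $words is array('cat', 'hat', 'cat', 'sat')
```

Duplicates are deliberately left in at this point, since counting them is what step 3 is about.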

So, I'm kinda stuck on part 3.

I know about array_unique() but this doesn't help with the 'number of occurrences' bit. I could walk through the words and test each one against the database, but I feel there must be a more efficient way.

Could someone either:

a) enlighten me
b) point me in the right direction
c) tell me I'm going about it completely the wrong way.

If so, that would be wonderful.

Cheers [smurf]

01101000011000010110010001110011

AnakinPt · Aug 9, 2002

Why don't you fill the array with the words as keys?

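$array = array();
while ($text = fread(...)) { // I suppose you are reading from a file or something like it.
    // Start the counter at 0 the first time a word is seen, so the increment never hits an undefined key.
    if (!isset($array[$text])) {
        $array[$text] = 0;
    }
    $array[$text]++;
}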

In the end you get an array where the keys are the words and the values are the number of hits.
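
If the words are already sitting in an array, PHP's built-in array_count_values() produces the same word => hits array in one call. A minimal sketch, assuming a sample `$words` array that stands in for the real article text:

```php
<?php
// Sample word list, standing in for the words extracted from an article.
$words = array('cat', 'hat', 'cat', 'sat', 'cat');

// array_count_values() returns an array keyed by each distinct value,
// with the number of occurrences as the value.
$counts = array_count_values($words);

// Optional: sort by count, highest first, keeping the word keys.
arsort($counts);

foreach ($counts as $word => $hits) {
    echo $word . ': ' . $hits . "\n";
}
```

Each key/value pair of `$counts` then maps directly onto one row (word, hits) for the table.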

Anikin
Hugo Alexandre Dias
Web-Programmer
anikin_jedi@hotmail.com

hads · Aug 11, 2002

Hey Anikin,

Thanks heaps for the reply. I came up with something over the weekend using two arrays with the same keys. I'm not sure if it is the best way, but it is working for the moment. I may look into rewriting it at some stage.

I will see if it is more efficient using your method.
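
The code for the two-array version is not shown in the thread, so the following is only a guess at what "two arrays with the same keys" might look like, included for comparison. The variable names and sample input are assumptions.

```php
<?php
// Sample input, standing in for the words extracted from an article.
$input = array('cat', 'hat', 'cat', 'sat');

$words  = array(); // key => word
$counts = array(); // same key => number of occurrences

foreach ($input as $word) {
    // array_search() returns the key of the word, or false if it is not stored yet.
    // The strict comparison matters because key 0 is a valid result.
    $key = array_search($word, $words);
    if ($key === false) {
        $words[]  = $word;
        $counts[] = 1;
    } else {
        $counts[$key]++;
    }
}
// $words  is array('cat', 'hat', 'sat')
// $counts is array(2, 1, 1)
```

A version like this scans `$words` once per input word, whereas using the words themselves as keys is a direct lookup, so the keyed approach should scale better on long articles.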

Thanks again. [smurf]

01101000011000010110010001110011
