finding words within words

Status
Not open for further replies.

danny2785

Programmer
Jun 26, 2006
Does anybody know of a module that can find all the possible words within another word? Thanks in advance.
 
Like an anagram solver?

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
For example, helloworld.

This contains:
1. hello
2. world
3. hell
4. low
 
You're going to have to program this yourself. First off, you'll need some type of dictionary file or module to test whether something is a word. Secondly, you're just going to have to cycle through all the candidate substrings and test each one, brute-force style.

Code:
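use strict;
use warnings;

my $word = 'helloworld';
my $len = length($word);

# $i is the start offset, $j is the substring length.
# isWord() is not defined here; you have to supply it yourself.
for (my $i = 0; $i < $len; $i++) {
  for (my $j = 1; $j <= $len-$i; $j++) {
    my $subword = substr($word, $i, $j);
    if (isWord($subword)) {
      print "$subword\n";
    }
  }
}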

Now, how you're going to write an isWord function, damned if I know. I'd probably start with some publicly accessible dictionary file and go from there. If you need broader coverage, you'll just have to get a bigger dictionary file or interface with some website.
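
One way the dictionary-file idea could look, as a rough sketch only: read a word list (one word per line) into a hash and have isWord do a case-insensitive lookup. The helper name load_words, the four-word sample list, and the /usr/share/dict/words path in the comment are assumptions for illustration, not something from the post above.

```perl
use strict;
use warnings;

# Build a lookup table from a word list, one word per line.
# Takes any readable filehandle, so the source can be a file or a string.
sub load_words {
    my ($fh) = @_;
    my %words;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;            # tolerate DOS line endings
        next unless length $line;
        $words{lc $line} = 1;
    }
    return \%words;
}

my %dict;    # filled once below, then consulted by isWord()

# Returns 1 if $candidate is in the loaded word list, 0 otherwise.
sub isWord {
    my ($candidate) = @_;
    return exists $dict{lc $candidate} ? 1 : 0;
}

# Hypothetical tiny word list standing in for a real dictionary file, e.g.
#   open my $fh, '<', '/usr/share/dict/words' or die "Cannot open: $!";
my $sample = "hello\nworld\nhell\nlow\n";
open my $fh, '<', \$sample or die "Cannot open in-memory list: $!";
%dict = %{ load_words($fh) };
close $fh;

print isWord('hello') ? "hello is a word\n" : "hello is not a word\n";
```

A hash lookup keeps each test cheap, so the cost is dominated by how many candidates you generate rather than by the size of the dictionary.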
 
should say:

then check each permutation against a dictionary file.

You may want to set a limit on the overall length of the string and not check permutations shorter than 3 characters; otherwise it could take a long time to check all the permutations of a long string. A 10-letter string has 3,628,800 (10!) permutations!
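
To put numbers on that, here is a small sketch that applies both limits to the contiguous-substring search from the earlier loop and compares the candidate count with the number of full rearrangements. The function names and the limit of 20 characters are made up for the example; pick whatever limits suit your data.

```perl
use strict;
use warnings;

# Number of orderings of $n distinct letters (n!).
sub factorial {
    my ($n) = @_;
    my $f = 1;
    $f *= $_ for 2 .. $n;
    return $f;
}

# Contiguous substrings of $word that are at least $min_len characters long.
# Returns an empty list if $word is longer than $max_word_len.
sub candidates {
    my ($word, $min_len, $max_word_len) = @_;
    my $len = length $word;
    return () if $len > $max_word_len;
    my @out;
    for my $start (0 .. $len - $min_len) {
        for my $size ($min_len .. $len - $start) {
            push @out, substr($word, $start, $size);
        }
    }
    return @out;
}

my @c = candidates('helloworld', 3, 20);
print scalar(@c), " substrings of length 3 or more\n";
print factorial(10), " orderings of 10 distinct letters\n";
```

If you only need words that appear in order inside the string, the substring count grows roughly with the square of the length, so the factorial blow-up only bites when you really do want rearranged letters.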

- Kevin, perl coder unexceptional!
 
You may get some relief by checking the initial letters of each perm: if no dictionary word starts with them, discard the perm. There's also some scope for rejecting character sequences that don't exist, at least in the English language:

'xk', 'cx', ... etc etc
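
A hedged sketch of the letter-pair idea: rather than hand-writing a list of impossible pairs, you could collect every adjacent pair that does occur in your word list and throw away any candidate containing a pair that never shows up. The names known_pairs and plausible and the four-word list are assumptions for the example.

```perl
use strict;
use warnings;

# Collect every adjacent letter pair that occurs in the word list.
sub known_pairs {
    my @words = @_;
    my %pairs;
    for my $w (@words) {
        my $lw = lc $w;
        for my $i (0 .. length($lw) - 2) {
            $pairs{ substr($lw, $i, 2) } = 1;
        }
    }
    return \%pairs;
}

# True if every adjacent pair in $candidate was seen in the word list.
# A true result only means "worth a dictionary lookup", not "is a word".
sub plausible {
    my ($candidate, $pairs) = @_;
    my $lc = lc $candidate;
    for my $i (0 .. length($lc) - 2) {
        return 0 unless $pairs->{ substr($lc, $i, 2) };
    }
    return 1;
}

my $pairs = known_pairs(qw(hello world hell low));
for my $candidate (qw(lowo xk)) {
    print "$candidate: ", plausible($candidate, $pairs) ? "check dictionary" : "discard", "\n";
}
```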

As for word lengths: if a candidate is over a certain length, it might make more sense to load the words of that length from your dictionary into a tree and compare letter by letter until you either complete a match or fail to match the next letter.
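
Roughly what the tree approach might look like, as a sketch under a few assumptions: the tree is a nested hash with one level per letter, a "\0" key (my own choice of end marker) flags a complete word, and the search walks forward from each start position until the next letter has no branch. This version loads all words rather than only those of one length, and the names build_tree and words_within are invented here.

```perl
use strict;
use warnings;

# Insert each word into a nested-hash tree, one level per letter.
# The key "\0" (an assumed end marker) flags a complete word.
sub build_tree {
    my @words = @_;
    my %root;
    for my $w (@words) {
        my $node = \%root;
        for my $ch (split //, lc $w) {
            $node->{$ch} ||= {};
            $node = $node->{$ch};
        }
        $node->{"\0"} = 1;
    }
    return \%root;
}

# From each start position, walk the tree letter by letter and stop
# as soon as the next letter has no branch.
sub words_within {
    my ($text, $tree) = @_;
    my @found;
    my @chars = split //, lc $text;
    for my $start (0 .. $#chars) {
        my $node = $tree;
        for my $end ($start .. $#chars) {
            $node = $node->{ $chars[$end] } or last;
            push @found, join('', @chars[$start .. $end]) if $node->{"\0"};
        }
    }
    return @found;
}

my $tree = build_tree(qw(hello world hell low));
print "$_\n" for words_within('helloworld', $tree);
```

The early exit is what gives you the initial-letters check for free: a start position whose first letters match no dictionary word is abandoned after one or two steps.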

Paul
------------------------------------
Spend an hour a week on CPAN, helps cure all known programming ailments ;-)
 