Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Word Descrambler

Status
Not open for further replies.

bdmangum

Technical User
Dec 6, 2006
171
US
Hey guys,

The other day I was playing around and thought it might be fun to try and build a word descrambler in Excel using VBA. Most places online tend to use C++ or some scripting language to run this sort of code.

My thought was to store the list of possible words in a spreadsheet. Below is the code I currently have written. I'm somewhat stuck as to how to write the algorithm to work for an input word of any length. If I set a limit on the length of the word, the code is simple. But how can it work with an unknown word length? Any thoughts?

Note that some of the code is simply testing the command within the intented context (ie the checkspelling command) and not the actual implementation of the command. My code is in the very early stages.

Code:
Sub WordGen()

Dim strWord As String, lenWord As Integer, count As Integer, isWord As Boolean
Dim lArray()

'Set variables
    strWord = InputBox("Input word or letters to be descrambled.", "Word Descrambler")
    lenWord = Len(strWord)
    ReDim lArray(lenWord - 1)
    ThisWorkbook.Sheets(1).Range("A2").Value = strWord
    
'Loop through letters to fill array
    For i = 0 To lenWord - 1
        lArray(i) = LCase(Right(Left(strWord, i + 1), 1))
    Next
    
    iLet = LCase(Right(Left(strWord, i), 1))
    
    Application.DisplayAlerts = False
    isWord = ThisWorkbook.Sheets(1).Range("A2").CheckSpelling
    Application.DisplayAlerts = True
        
End Sub

Thanks,
BD
 
Here is some code I put together. It seems to run pretty fast for words up to length of 9 letters. The biggest difference in speed seems to be removing the Application.CheckSpelling and replacing it with a dictionary and using the Dictionary.Exists approach.

If any has any ideas to speed it up, chime in!

Note: You need to download the dictionary file I modified from the link. Make sure to update the filepath in the CreateDictionary subroutine once you download the file.


Code:
Public MyDictionary As Object

Sub Permutations()
' Select a string, then run the macro
Dim strTest As String
Dim strOut, strVal, strChk As Integer, strTime As Date
Dim i As Long, j As Long, pos As Long, k As Long, l As Long, m As Long

ThisWorkbook.Sheets(1).Range("B3").value = "Filler"
ThisWorkbook.Sheets(1).Range(Cells(2, 2), Range("B2").End(xlDown)).ClearContents

strTest = LCase(InputBox("Input word or letters to be descrambled.", "Word Descrambler"))
ThisWorkbook.Sheets(1).Range("A2").value = strTest

strTime = Time
strOut = Mid(strTest, 1, 1)
strOut = StringString(2, strOut & " ")
strOut = RTrim(strOut)
strChk = False

CreateDictionary    'builds dictionary

For j = 2 To Len(strTest)
    strOut = Split(strOut, " ")
    For i = 0 To UBound(strOut)
        pos = i Mod (2 * j)
        pos = pos - (j - 1)
        Select Case pos
        Case Is <= 0
            pos = Abs(pos)
        Case Else
            pos = pos - 1
        End Select
        strOut(i) = InsertString((strOut(i)), _
        Mid(strTest, j, 1), pos)
    Next i
    If j < Len(strTest) Then
        For i = 0 To UBound(strOut)
        strOut(i) = _
          StringString(j + 1, strOut(i) & " ")
        Next i
    End If
    strOut = Join(strOut, " ")
    strOut = Replace(strOut, "  ", " ")
    strOut = RTrim(strOut)
Next j
strOut = Split(strOut, " ")
For l = 2 To Len(strTest)
If l = 2 Then loopit = Time
    For k = 0 To UBound(strOut)
        If MyDictionary.Exists(Left(strOut(k), l)) = True Then
            If strVal <> "" Then
                strChk = InStr(1, strVal, Left(strOut(k), l))
                If strChk <> 0 Or strChk <> Null Then
                    If Right(Left(strVal, strChk + Len(Left(strOut(k), l))), 1) <> "," Then
                        strVal = strVal + Left(strOut(k), l) + ", "
                    End If
                Else
                    strVal = strVal + Left(strOut(k), l) + ", "
                End If
            Else
                strVal = Left(strOut(k), l) + ", "
            End If
        End If
    Next k
Next l
strVal = Left(strVal, Len(strVal) - 2)
ThisWorkbook.Sheets(1).Range("F1").value = Time - strTime
ThisWorkbook.Sheets(1).Range("F2").value = UBound(strOut) + 1
strVal = Split(strVal, ", ")
For i = 0 To UBound(strVal)
    ThisWorkbook.Sheets(1).Cells(i + 2, 2).value = strVal(i)
Next i
strVal = Join(strVal, ", ")
strVal = Replace(strVal, "  ", " ")
strVal = RTrim(strVal)

MsgBox "Valid words created from given letters are listed below:" & vbLf & vbLf & strVal & vbLf & vbLf & "Macro run time: " & ThisWorkbook.Sheets(1).Range("F1").Text

End Sub

Function InsertString(strOld As String, strInsert As String, pos As Long) As String
' Inserts strInsert into strOld at position pos
InsertString = Left(strOld, pos) & strInsert & Mid(strOld, pos + 1)
End Function

Function StringString(imax As Long, myString As String) As String
' Comparable to String(). Creates string of imax times the string myString
Dim i As Long
If imax < 0 Then MsgBox "imax < 0", vbCritical
i = 0
StringString = ""
Do While i < imax
    Select Case i
    Case 0, Is >= imax / 2
        StringString = StringString & myString
        i = i + 1
    Case Else
        StringString = StringString & StringString
        i = 2 * i
    End Select
Loop
End Function

Sub CreateDictionary()

Dim intInput As Integer, strBuffer As String, FileName As String, count As Long
Set MyDictionary = CreateObject("Scripting.Dictionary")

FileName = "C:\Documents and Settings\bmangum\My Documents\English_Dictionary.dic"

'Open the dictionary file
intInput = FreeFile
Open FileName For Input As #intInput
count = 1

Do
    'Get a line from the temporary file
    On Error Resume Next
    Line Input #intInput, strBuffer
    MyDictionary.Add strBuffer, strBuffer

Loop Until EOF(intInput)

End Sub
 
You might want to see what results it produces for 'descrambler' ...

(and that dictionary is tiny)
 
Interesting; I, too, play Scrabble and have even been in a couple of competitions. I understand that you are not trying to recreate Scrabble, but the Scrabble dictionary has tons of words that are not contained in a regular dictionary. These are real words and are contained in an unabridged dictionary, but not a college dictionary, as so many people own.

How would you account for that?

Just a thought.



Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
The dictionary I used is very small. I'm still trying to find a better, complete dictionary. Is there a SOWPODS complete word list available online?

I recently grabbed a few more dictionary lists from the web. I think I will simply build a macro which will combine all the dictionaries I find into one dictionary. Maybe once I add them all together and include SOWPODS the dictionary will be a decent dictionary.

Anyone have ideas on how to further increase the speed of the posted macro? I knnow it works well for up to nine letters, but perhaps we can make it go even further.
 
Yep, think I have code that is substantially faster, using the approach I described earlier. Just need to tidy it up a little ...

Takes about 15 seconds to start up in order to load the full SOWPODS word list on my 2.8GHz box (single core). Once that is done it hardly delays at all. e.g will find all the anagrams and sub-anagrams of an 13 letter word such as traditionally in about 0.25s
 
Ok, here's my version. It is basically a bare bones implementation, although I've tried to make some of it map onto functions in bdmangum's solution. It also assumes that you have the SOWPODS word list:
Code:
[blue]Option Explicit

Private WordList As Dictionary

Public Sub Permutations()
    Dim strWord As String
    Dim strVal As Variant
    Dim i As Long
    
    If WordList Is Nothing Then LoadSOWPODS
    strWord = InputBox("Word:")
    'Debug.Print GetAllAnagrams2(strWord)
    
    strVal = Split(GetAllAnagrams2(strWord), " ")
    Columns("A:A").ClearContents
    For i = 0 To UBound(strVal)
        ThisWorkbook.Sheets(1).Cells(i + 1, 1).Value = strVal(i)
    Next i
End Sub


Private Sub LoadSOWPODS()
        ' Load SOWPODS and convert to indexed anagram dictionary
    ' This is the only time-consuming part of the program, as it is
    ' loading and indexing about 300000 words
    Dim vString As Variant
    Set WordList = New Dictionary
    With New FileSystemObject
        For Each vString In Split(.OpenTextFile("c:\sowpods.txt").ReadAll, vbCrLf)
            ListAdd CStr(vString)
        Next
    End With
End Sub

Private Sub ListAdd(strWord As String)
    Dim strSerial As String
    strSerial = Serialise(strWord)
    ' Word is either an anagram of a word already
    ' in the list or it is a brand new word
    If WordList.Exists(strSerial) Then
        WordList.Item(strSerial) = WordList.Item(strSerial) & " " & strWord
    Else
        WordList.Add strSerial, strWord
    End If
End Sub

' sorts letters into alpha order
Private Function Serialise(aString As String) As String
    Dim b() As Byte
    If aString <> "" Then
        b = StrConv(aString, vbFromUnicode)
        InsertionSortByte b
        Serialise = StrConv(b, vbUnicode)
    End If
End Function

' Simple Insertion Sort routine for sorting small byte array
Private Sub InsertionSortByte(CharArray() As Byte)
    Dim Char  As Byte
    Dim lp1 As Long
    Dim lp2 As Long
    
    For lp1 = LBound(CharArray) + 1 To UBound(CharArray)
        Char = CharArray(lp1)
        For lp2 = lp1 - 1 To LBound(CharArray) Step -1
                If CharArray(lp2) < Char Then
                    Exit For
                End If
                CharArray(lp2 + 1) = CharArray(lp2)
        Next lp2
        CharArray(lp2 + 1) = Char
    Next
End Sub

Private Function GetAllAnagrams2(strWord As String) As String
    ' Now we kind of reverse things to find all the anagrams our word contains
    ' We take each dictionary key (which is an alpha order string that indexes all anagrams of that string)
    ' and we check to see if all its letters are in our source word (this means we don't have to
    ' generate all the permutations of the source word and its sub-words)
    
    Dim vItem As Variant
    Dim lp As Long
    Dim found As Long
    
    strWord = Serialise(UCase(strWord))
    For Each vItem In WordList.Keys
        If Len(vItem) <= Len(strWord) Then
            found = 0
            For lp = 1 To Len(vItem)
                found = InStr(found + 1, strWord, Mid$(vItem, lp, 1))
                If found = 0 Then Exit For ' as soon as a letter doesn't match then fail
            Next
            If found > 0 Then ' if we didn't fail, then we have the key to anagrams contained in our source word
                If GetAllAnagrams2 = "" Then
                        GetAllAnagrams2 = WordList.Item(vItem)
                    Else
                        GetAllAnagrams2 = GetAllAnagrams2 & " " & WordList.Item(vItem)
                    End If
            End If
        End If
    Next
End Function
[/blue]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top