I want to check text against a custom dictionary. There are a lot of free word lists available for download, and I have a particularly long one: 109,583 words. Comparing every word in a given chunk of text with every word in the dictionary is prohibitively CPU- or disk-I/O-intensive.
My plan is to store and load the word list as hex (PHP's bin2hex function?), binary, or possibly floating-point data*, convert the chunk of text to the same format, and then do a faster numeric comparison. I'm pretty sure most spell checkers do binary comparisons against dictionaries.
* - As far as I know, floating-point registers are the fastest.
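
To make the idea concrete, here is a minimal sketch of what I have in mind, assuming the word list is already loaded into a PHP array. The function names buildHexDictionary and findUnknownWords are hypothetical, and keying an array by the hex string is just one way to do the comparison. I haven't confirmed that this is any faster than comparing the plain strings.

```php
<?php
// Hypothetical helper: build a lookup table keyed by the hex form of each word.
function buildHexDictionary(array $words): array
{
    $table = [];
    foreach ($words as $word) {
        // Normalise case and whitespace before converting to hex.
        $table[bin2hex(strtolower(trim($word)))] = true;
    }
    return $table;
}

// Hypothetical helper: return the words in $text whose hex form is not in the table.
function findUnknownWords(string $text, array $hexDictionary): array
{
    $unknown = [];
    // Split on anything that is not a lowercase letter or an apostrophe.
    $tokens = preg_split("/[^a-z']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        if (!isset($hexDictionary[bin2hex($token)])) {
            $unknown[] = $token;
        }
    }
    return $unknown;
}

// Tiny made-up word list standing in for the real 109,583-word file.
$dictionary = buildHexDictionary(['the', 'quick', 'brown', 'fox']);
$unknown = findUnknownWords('The quick brwn fox', $dictionary);
```

With the toy list above, $unknown should contain only 'brwn'. The real list would be read from the file once and the table reused for each chunk of text.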