Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations John Tel on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

KEYWORD SEARCH IS OVERLOADING THE SERVER!

Status
Not open for further replies.

harmonyz

IS-IT--Management
Nov 21, 2001
1
HK
am running CF Server 4.5 with IIS 5 on Win 2000 Server, and with SQL 2000 D/B.

I have a text field called TEXT_TO_SEARCH containing text - entered from a form on one of my .cfm pages, approx 2000-4000 characters each - in one of the tables, called MAINDATA in my SQL 2000 database .

I need to search the TEXT_TO_SEARCH field in the MAINDATA table for the presence of certain keywords and the number of occurences of each keyword. I have a table, called KEYWORDLIST that contains these keywords, approx 2500 in number.

I want to run this process as a daily schedule, so for example if I have 100 rows added to my MAINDATA table in the day, I need to search 100 TEXT_TO_SEARCH fields, each approx 2000-4000 characters long for the presence and no of occurences of 2500 keywords at a time.

The catch is that I only want to match complete words, therefore, if the TEXT_TO_SEARCH
field is :

'For information about offline browsing with Internet Explorer, click the Help menu, and then click Contents and Info'

and one of my keywords is 'the', then it should only match the word 'the' and not the word 'then' in the TEXT_TO_SEARCH field.

Now when I run this schedule on my development environment, it takes 24 minutes for only 15 rows of my MAINDATA table to pick out all the keywords and the number of occurences !! What's more during those 24 minutes, my CPU utilization goes up to 100 %.

I'm using the FindNoCase function to match the keywords.

My .cfm page code structure looks like this :

begin loop 1 : get TEXT_TO_SEARCH field from MAINDATA (database query)

begin loop 2 : get keyword from KEYWORDLIST (database query)

if Findnocase(keyword) is not 0
begin loop 3 : find next occurence of keyword until the end of TEXT_TO_SEARCH

end loop 3

end loop 2
end loop 1

So for 15 rows in my MAINDATA table and 2500 keywords, this code is performing at least 15 * 2500 = 37500 findnocase operations, probably more if the keyword occurs more than once.

Is this what's causing the process to run so slowly ? Does anybody have any suggestions as to how I can make this loop structure more efficient in terms of time taken and memory consumed ?
Or can anybody suggest any custom tags or third party software that can do the same thing for me in a more efficient manner ?

Any help would be greatly appreciated.
 
Maybe you can fit Verity into your task, Allaires fulltextengine incorporated into ColdFusion.
Chapter 3/Configuring ColdFusion Server/Indexing Data with Verity

HTH
reimer
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top