
Storing and Retrieving FullText

Status
Not open for further replies.

CFB

Programmer
Jan 11, 2001
74
US
I'm planning on storing/indexing millions of text documents and allowing them to be keyword searched and retrieved via a web page. I'm trying to figure out the best way to handle this. I know it's a broad question, but does anyone have any suggestions regarding the best way to do this?

I'm considering three different options:
1. storing flat files which I'll search based on XML schema
2. storing in a SQL Server database using full-text indexes
3. storing in an Oracle database using their XML functionality

I'm not well versed in any of these methods, so there will obviously be some learning needed on my part. But, I want to try to do this in the most efficient manner possible. I'm also expecting that there may be a completely different way that I haven't listed, that would be a better implementation. Thanks for any help or opinions that you guys can provide.
 
I don't think you need a database for this purpose.

Just use Microsoft Index Server to index all your pages. You can then search the indexed pages from your web page through an object that queries Index Server, using simple scripting.

Kishore MCDBA
 
Thanks Kishore. Do you think this is faster than parsing the fulltext I'm storing for each word and indexing each word with a reference to the document it came from? I need to limit who can view/edit/update these fulltext documents based on certain privileges, so I have to use a database at some point.

I guess what I'm asking is, what's the best way to store fulltext (for example blogs), allow it to be catalogued in a database, and index it some way so it can be searched very quickly. Thanks again.

-CFB
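The word-by-word approach described above (parse each document, then index each word with a reference to the document it came from) is usually called an inverted index. A minimal sketch of the idea in Python follows. The sample documents, the `tokenize`, `build_index`, and `search` names, and the in-memory storage are all assumptions made for illustration; with millions of documents the index would need to live on disk or in a database rather than in a dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Start from the first word's documents, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, not taken from the thread.
documents = {
    1: "Storing full text in a database",
    2: "Searching text documents by keyword",
    3: "Database privileges for editing documents",
}
index = build_index(documents)
print(sorted(search(index, "text database")))  # [1]
print(sorted(search(index, "documents")))      # [2, 3]
```

Lookups touch only the sets for the query words, which is why this tends to be much faster than scanning every document at search time.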
 
Hmm...after reading a few other threads (which I should have done in the first place), I'm thinking that a data warehouse isn't what I'm actually looking for. I'll be updating/inserting the data directly, so I don't think I want to use the data warehouse structure to store the fulltext I'm collecting.
 
Yup, you don't need any data warehouse concepts; all you are looking for is document management. To set the privileges you need to use a database, and for indexing just use Microsoft Indexing Service. With Indexing Service you can even generate a word index, so your work will be done quickly. For further information about Indexing Service, check the help or go to any ASP forum.

Kishore MCDBA
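One way to read the split suggested above is that the indexer returns matching document ids and the database decides which of them a given user may see. The sketch below illustrates that with Python's built-in `sqlite3` module standing in for the real database. The table layout, the user names, the sample rows, and the `visible_hits` helper are all hypothetical, and `hit_ids` is assumed to come from whatever indexing service is used.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE permissions (user TEXT, doc_id INTEGER, can_view INTEGER);
""")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(1, "Public notes"), (2, "Private draft"), (3, "Team blog")],
)
conn.executemany(
    "INSERT INTO permissions VALUES (?, ?, ?)",
    [("alice", 1, 1), ("alice", 2, 0), ("alice", 3, 1), ("bob", 2, 1)],
)


def visible_hits(conn, user, hit_ids):
    """Keep only the search hits that this user is allowed to view."""
    if not hit_ids:
        return []
    # One "?" placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in hit_ids)
    query = (
        "SELECT d.id, d.title FROM documents d "
        "JOIN permissions p ON p.doc_id = d.id "
        "WHERE p.user = ? AND p.can_view = 1 "
        f"AND d.id IN ({placeholders}) ORDER BY d.id"
    )
    return conn.execute(query, [user, *hit_ids]).fetchall()


# Pretend the indexer matched all three documents.
print(visible_hits(conn, "alice", [1, 2, 3]))  # [(1, 'Public notes'), (3, 'Team blog')]
print(visible_hits(conn, "bob", [1, 2, 3]))    # [(2, 'Private draft')]
```

Filtering after the search keeps the index itself free of access rules, at the cost of one extra database query per search.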
 
Thanks Kishore, but can you tell me whether it is possible to do this with PHP instead of ASP? Do you have any sample code?
 
Yes, it is possible with PHP. PHP and ASP come from different providers, but the concept is the same: both are server-side scripting technologies that generate HTML and send it to the client.

Kishore MCDBA
 
Well, DWMaster, I'm sorry, but I'm not sure whether you can use the Microsoft Indexing Service object with PHP or not.

Kishore MCDBA
 
You may also try the Oracle Text feature in Oracle9i, which lets you store text data and also link to an external file containing text data.

thanks
dwpro.
 