Search within documents! Pls Help

Status
Not open for further replies.

Forri

Programmer
Oct 29, 2003
479
MT
Hi All

I'm creating a small web-based search engine which will search for text (given by the user) within a folder of documents.

What command should I use, and how, to find the documents that contain certain text? Can I retrieve a list of the matching documents, and is it possible to sort that list by number of hits?


Thanks in advance
Nick
 
Thanks all

I know that I should use grep, but I really don't know how to put together a command that reads inside the documents (OpenOffice, plain text, etc.).

Any help would be great!

Thanks again
Nick
 
grep's -a option lets it treat binary files as text, but that doesn't help here: OpenOffice documents are zip archives (despite the .sxw extension), and the content is stored inside the archive in a file called 'content.xml'.

I tried something like this:
Code:
find ./ -name "*.sxw" -exec unzip -p {} content.xml \; | grep -c PATTERN
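This finds all .sxw files, extracts 'content.xml' to a pipe (-p), and greps with a count (-c) for the pattern.
But it doesn't work as expected: the output of every unzip is fed into one single grep, so you get one total instead of a count per document.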

I tried grepping inside the find:
Code:
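#!/bin/bash
# docfind.sh: count the lines of content.xml in document $1 that match pattern $2
# (quotes keep file names and patterns containing spaces intact)
unzip -p "$1" content.xml | grep -c -- "$2"
# eof

find ./ -name "*.sxw" -exec ./docfind.sh {} PATTERN \;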
Perhaps someone knows how to build a solution from these fragments.
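
One way the fragments might be combined is sketched below. Two things are missing from docfind.sh above: it prints only the count and not the file name, and grep -c counts matching lines, not hits, which matters because content.xml tends to hold the whole text on very few lines. The sketch uses grep -o piped to wc -l to count every occurrence, then sort -rn to list the documents with the most hits first. The script name rankdocs.sh and the function names are made up for this example, and the pattern is treated as a fixed string (-F), which is an assumption about what the search form should do.

Code:
```shell
#!/bin/bash
# rankdocs.sh (hypothetical name): list .sxw files by number of hits.

# Print the raw content.xml of one OpenOffice document on stdout.
extract_text() {
    unzip -p "$1" content.xml 2>/dev/null
}

# Count every occurrence of the fixed string $1 on stdin.
# grep -o prints each match on its own line, so wc -l counts matches,
# not matching lines (which is what grep -c would give).
count_hits() {
    grep -o -F -- "$1" | wc -l | tr -d ' '
}

# Print "COUNT<TAB>FILE" for every document under directory $1
# that contains pattern $2, most hits first.
rank_docs() {
    local dir=$1 pattern=$2 file hits
    find "$dir" -type f -name '*.sxw' -print0 |
    while IFS= read -r -d '' file; do
        hits=$(extract_text "$file" | count_hits "$pattern")
        if [ "$hits" -gt 0 ]; then
            printf '%s\t%s\n' "$hits" "$file"
        fi
    done | sort -rn
}

# Run only when called with two arguments: ./rankdocs.sh DIR PATTERN
if [ "$#" -eq 2 ]; then
    rank_docs "$1" "$2"
fi
```

Note that this searches the raw XML, so a pattern can also match tag or attribute names, and a phrase that is split by formatting tags inside the document will be missed. Plain text files in the same folder could be handled by giving extract_text a second branch that simply cats the file.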
seeking a job as java-programmer in Berlin: