Automate keyword search 1

Status
Not open for further replies.

AHJ1

Programmer
Oct 30, 2007
69
US
I need to search a number of bills that have been passed by the Nevada legislature for the occurrence of keywords.

The bills are posted on a public website in .pdf format.

Can anyone provide any advice on the best way to automate the search i.e.
1) Open a web page such as
2) Loop through a table that has the keywords, using either Adobe Reader or Adobe Acrobat. (Does Reader have an object library that can be manipulated?)

* If a keyword is found then display that keyword in context.
* Allow for manual searching for the next occurrence of the keyword
* When there are no more occurrences of that keyword, go to the next keyword on the list.

3) Advise when all keywords have been searched.

Thanks,
Alan
 
To be honest I'd store your keywords in a database then automate a Google search against those public PDFs, dynamically building a search string against the file(s) you want and run it in your browser.

John

 
John,

That's an interesting idea. Could you point me towards a code snippet that might serve as a guide?

Alan
 
I haven't got any sample code to hand to do this; it's an idea that came into my head as I read your posting. However, you need to build up a string variable similar to:

keyword+inurl%3Awww.somewebsite.com%2Ffolder%2Ffilename.pdf&meta=

The bits that need changing are the keyword, the web site, the folder and the file name.

You can then run this with
Application.FollowHyperlink yourvariable.

The %2F in the string is the URL-encoded form (the hexadecimal ASCII code) of the / character; likewise, %3A is the : character.
The "folder" needs to be the folder name under the main web site (for example - this is the /folder/

Of course, you are limited in that Google can only show what it has indexed, but I hope that this gives you an idea.
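
As a rough sketch only, the string above could be built in VBA along these lines. The function names UrlEncodeAscii and BuildGoogleQuery are made up for illustration, and the encoder assumes plain ASCII text (it makes no attempt at UTF-8).

```vba
' Hypothetical helper: percent-encode every character except letters,
' digits and - . _ ~ (plain ASCII input assumed).
Function UrlEncodeAscii(ByVal strText As String) As String
    Dim i As Long
    Dim strChar As String
    Dim strOut As String

    For i = 1 To Len(strText)
        strChar = Mid$(strText, i, 1)
        Select Case Asc(strChar)
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                ' 0-9, A-Z, a-z, "-", ".", "_", "~" pass through unchanged
                strOut = strOut & strChar
            Case Else
                ' Everything else becomes % plus its two-digit hex code, e.g. / -> %2F
                strOut = strOut & "%" & Right$("0" & Hex$(Asc(strChar)), 2)
        End Select
    Next i

    UrlEncodeAscii = strOut
End Function

' Hypothetical helper: assemble keyword+inurl%3A<page>&meta= as described above.
Function BuildGoogleQuery(ByVal strKeyword As String, ByVal strPage As String) As String
    BuildGoogleQuery = UrlEncodeAscii(strKeyword) & "+inurl%3A" & _
                       UrlEncodeAscii(strPage) & "&meta="
End Function
```

For example, typing ? BuildGoogleQuery("keyword", "www.somewebsite.com/folder/filename.pdf") in the Immediate window should print keyword+inurl%3Awww.somewebsite.com%2Ffolder%2Ffilename.pdf&meta=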

You can store your keywords in one table and web links in another table, then use a query without joins to get the cartesian product (i.e. one row for every keyword in every link) to run them.
Loop through this with a recordset object to automate the generating of the code and open it in your browser when you need.

If you use the Web browser control you should be able to retrieve the search results and push them into your db in another table to save rerunning queries to look at your results.

John
 
Here's a very basic function to run this for you.

Create one table, tblKeywords, as mentioned above, and list the keywords you want to search for in a column called keyword.

Create another table called tblPages with a field webpage. In that, put the URL of the web page or PDF you want to search for (eg
Create a query called qryKeywords as above - ie make it the cartesian product of the two. If this isn't exactly what you want, you can add extra filtering in so that not all keywords get applied to all files.
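
If it helps, the query can also be created from code rather than in the query designer. This is only a sketch: the Sub name CreateKeywordQuery is made up, and it assumes tblKeywords and tblPages already exist with the fields named above.

```vba
Sub CreateKeywordQuery()
    ' Listing both tables with no JOIN and no WHERE clause returns the
    ' cartesian product: one row for every keyword/webpage combination.
    Const SQL_TEXT As String = _
        "SELECT tblKeywords.keyword, tblPages.webpage " & _
        "FROM tblKeywords, tblPages;"

    ' CreateQueryDef saves the query under the given name; it raises an
    ' error if a query called qryKeywords already exists.
    CurrentDb.CreateQueryDef "qryKeywords", SQL_TEXT
End Sub
```

Run it once; after that qryKeywords shows up with your other queries, and you can add a WHERE clause there if you want to filter which keywords apply to which files.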

Ensure you are connected to the internet with any necessary information (proxy servers etc) and your browser is not set to work offline.

Then, run the function.

Code:
Function QueryGoogle()
    ' Requires a reference to the Microsoft ActiveX Data Objects library.

    ' Query that generates the data (the cartesian product of tblKeywords and tblPages)
    Const QUERY_NAME = "qryKeywords"

    Dim cn As ADODB.Connection
    Dim rst As ADODB.Recordset
    Dim strURL As String

    Set cn = CurrentProject.Connection
    Set rst = cn.Execute(QUERY_NAME)

    Do While Not rst.EOF
        ' Quote the keyword for an exact-phrase match and restrict the search
        ' to the stored page with inurl:, all inside the single q= parameter.
        strURL = "http://www.google.co.uk/search?q=" & Chr$(34) & rst!keyword & Chr$(34) & _
                 "+inurl:http://" & rst!webpage & "&meta="
        Application.FollowHyperlink strURL, NewWindow:=True, AddHistory:=False

        rst.MoveNext
    Loop

    rst.Close
    ' CurrentProject.Connection belongs to Access, so release the reference rather than closing it.
    Set rst = Nothing
    Set cn = Nothing

End Function

There's a lot that could be done to tidy it up, but it's a basic idea. To learn about writing advanced queries with Google, look at
I have used the Google UK site as that is where I am based; it is probably advisable to use your geographically closest site.

John
 
Thank you very much for the suggestion, and for the follow up. It's beyond the call of duty.

Alan
 