traverse and download

Status
Not open for further replies.

Mag0007

MIS
Feb 15, 2005
829
US
Is it possible to give a URL to a script and have it pick out the links to ".doc" or ".xls" files and download them?

For example:

% find_file ".doc .xls"
and it would go to foo_docs_1.htm, look for links to .doc and .xls files, and download them for me?


thanks in advance
 
I'd write a script based on wget. Depending on how easy it is to grab the URLs needed to download the documents, it would not be that hard.

First use wget to save the original HTML document to a file and see what you need to awk/grep out of it to get the URLs of the documents, then use wget to download the documents one by one.
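
The grep step described above could look roughly like the sketch below. It is only an illustration: the function name extract_doc_links is made up, it assumes the page uses double-quoted href attributes, and it relies on grep -o, which GNU and BSD grep provide but POSIX does not require.

```shell
# extract_doc_links (hypothetical helper): read HTML on stdin and print
# the href targets that end in .doc or .xls, one per line.
# Assumes href values are wrapped in double quotes.
extract_doc_links() {
    # -o prints only the matching part, -i also accepts HREF=
    grep -Eio 'href="[^"]+\.(doc|xls)"' |
        # strip the leading href=" and the trailing quote
        sed 's/^[^"]*"//; s/"$//'
}

# Possible use once the page has been saved (needs network access):
#   wget -q -O page.html http://local_site/foo_docs_1.htm
#   extract_doc_links < page.html
```

The resulting list can be handed to wget when the links are absolute URLs. Relative links would first need the site address put in front of them.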



HTH,

p5wizard
 
Hi

wget with the -A or --accept option:
Code:
wget -r -p -A.doc,.xls http://local_site/foo_docs_1.htm

lynx with the -dump option, then wget:
Code:
lynx -dump http://local_site/foo_docs_1.htm | \
sed '1,/References/d; s/.* //' | \
grep -E "\.(doc|xls)$" | \
wget -i -

The second may look strange, but sometimes that was exactly what I needed.
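
To see what the sed and grep stages of the lynx pipeline do, here they are run on a hand-written sample. The sample text only imitates the numbered "References" list that lynx -dump appends to its output, and the name filter_lynx_refs and the URLs are invented for the illustration.

```shell
# filter_lynx_refs (hypothetical name): the sed and grep stages of the
# lynx pipeline, wrapped in a function so they can be tried offline.
filter_lynx_refs() {
    # delete everything up to and including the "References" line,
    # then keep only the last space-separated field (the URL)
    sed '1,/References/d; s/.* //' |
        # keep URLs that end in .doc or .xls
        grep -E '\.(doc|xls)$'
}

# Hand-written imitation of lynx -dump output (assumed layout).
sample='   Reports: [1]report, [2]sheet, [3]home

References

   1. http://local_site/docs/report.doc
   2. http://local_site/docs/sheet.xls
   3. http://local_site/index.htm'

printf '%s\n' "$sample" | filter_lynx_refs
```

With this sample only the two document URLs are printed, which is the list that wget -i - then reads from standard input.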

Feherke.
 