How to get a list of all images on the live site?

Status
Not open for further replies.

imstillatwork

IS-IT--Management
Sep 26, 2001
1,605
US
I am working on a pre-existing site (DotNetNuke... blech).

The people running it put images all over the directories, placed at random, and then left them there even when they are no longer referenced in any HTML.

Is there a tool that can crawl the site and make a list of images (with paths) so I can compare that to the FTP listings and start deleting? Thanks!

There are too many images and too many folders to do it manually. It would take DAYS, and this is just a side job where I work.

Kevin
 
Hi

You forgot to mention your operating system. Some time ago I did the same thing this way:
[ul]
[li]ran [tt]wget[/tt] to mirror the site, with the -nv option to get cleaner output[/li]
[li]with [tt]sed[/tt] I cut the image URLs out of that output and transformed them into relative paths[/li]
[li]then [tt]sort[/tt]ed the list[/li]
[li]with the [tt]find[/tt] command I got a list of all image files in the DocumentRoot[/li]
[li]then ran [tt]sort[/tt] on the second list too[/li]
[li]from the two file lists, the [tt]diff[/tt] command created a list of differences[/li]
[/ul]
By the way, all of the above works on Linux, but the mentioned tools have versions ported to Windows too.
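
The steps above could be sketched roughly as the shell functions below. This is only an illustration, not the exact commands used at the time: the function names, the list of image extensions, the example URL and the document root path are all made up for the example, and the sed pattern assumes that [tt]wget -nv[/tt] writes log lines containing "URL:http://host/path [size] -> ...", which should be checked against the log your wget version actually produces.

```shell
#!/bin/sh
# Hypothetical helper names and extension list; adjust to the site in question.
IMG_RE='\.(gif|jpe?g|png|bmp|ico)$'

# Print the site-relative paths of images recorded in a "wget -nv" log file.
# Assumed log line shape:
#   <date> <time> URL:http://host/dir/file.png [size/size] -> "..." [1]
referenced_images() {   # $1 = wget log file
    sed -n 's|^.*URL:[a-z]*://[^/ ]*/\([^ ]*\) .*$|\1|p' "$1" |
        sed 's/[?#].*$//' |          # drop query strings and fragments
        grep -Ei "$IMG_RE" |         # keep image files only
        LC_ALL=C sort -u
}

# Print the paths of image files under a directory, relative to that directory.
images_on_disk() {      # $1 = document root
    ( cd "$1" && find . -type f | sed 's|^\./||' |
        grep -Ei "$IMG_RE" | LC_ALL=C sort -u )
}

# Print images that exist on disk but do not appear in the crawl log.
# In diff output, lines starting with "> " exist only in the second file.
unreferenced_images() { # $1 = wget log file, $2 = document root
    ref=$(mktemp) && disk=$(mktemp) || return 1
    referenced_images "$1" > "$ref"
    images_on_disk "$2" > "$disk"
    diff "$ref" "$disk" | sed -n 's/^> //p'
    rm -f "$ref" "$disk"
}

# Example run (placeholder URL and path, not taken from this thread):
#   wget -m -nv -o crawl.log -P mirror http://www.example.com/
#   unreferenced_images crawl.log /var/www/html > candidates.txt
```

Treat the resulting list as candidates only. The crawl sees just the images that wget can reach by following links, URLs may be percent-encoded or differ in letter case from the file names on disk, and anything referenced only from scripts or a database would be reported as unused.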

Feherke.
 