Hi friend,
It's very simple now. All you need is a database table with columns such as title, url, keyword, description and body. Program your page so that it reads the links from the home page and then visits each linked page in turn. Then it starts filtering: it first searches for the <title tag, gets its position, and extracts the whole string up to the closing </title>.
Similarly for the keywords and the other fields (keywords and description usually sit in <meta> tags in the page head).
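Here is a rough sketch of that extraction step in Python, just to show the idea. The function names (extract_title, extract_meta) are made up for this example, and I am assuming the keywords and description come from ordinary <meta name="..." content="..."> tags. The title is found by position as described above; the meta tags go through the standard html.parser module instead of manual string searching.

```python
import re
from html.parser import HTMLParser


def extract_title(html):
    """Return the text between <title ...> and </title>, or "" if missing."""
    # Find the position where the opening tag ends (case-insensitive).
    open_tag = re.search(r"<title\b[^>]*>", html, re.IGNORECASE)
    if open_tag is None:
        return ""
    # Look for the closing tag, starting right after the opening one.
    close_tag = re.compile(r"</title\s*>", re.IGNORECASE).search(html, open_tag.end())
    if close_tag is None:
        return ""
    return html[open_tag.end():close_tag.start()].strip()


class MetaParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            name = (attrs.get("name") or "").lower()
            if name:
                self.meta[name] = attrs.get("content") or ""


def extract_meta(html):
    """Return a dict such as {"keywords": "...", "description": "..."}."""
    parser = MetaParser()
    parser.feed(html)
    parser.close()
    return parser.meta
```

A page without a title or without meta tags just gives you an empty string or an empty dict, so you can decide yourself how to handle those cases.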
For the body, make sure you remove all the tags and the common words like "and", "of", "if", "but", etc., and then store the result in the database.
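Something along these lines could handle the body cleanup and the database insert. It is only a sketch: the stop-word list is a tiny illustrative one, the table name "pages" and the helper names are my own invention, and I picked SQLite simply because it ships with Python. Any database with the columns mentioned above would do.

```python
import re
import sqlite3
from html.parser import HTMLParser

# Illustrative list only; a real one would be much longer.
STOP_WORDS = {"and", "of", "if", "but", "the", "a", "an", "or", "to", "in"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def clean_body(html):
    """Strip tags, lowercase the text, and drop the stop words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z0-9']+", " ".join(parser.parts).lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def save_page(conn, url, title, keyword, description, body):
    """Insert or update one row in the (hypothetical) pages table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, title TEXT, keyword TEXT, "
        "description TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, title, keyword, description, body) "
        "VALUES (?, ?, ?, ?, ?)",
        (url, title, keyword, description, body),
    )
    conn.commit()
```

Using the url as the primary key means that crawling the same page twice overwrites the old row instead of creating a duplicate.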
The loop continues until all the links have been checked, though you will have to do a lot of validation along the way.
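The loop itself might look roughly like this. Again, just a sketch under my own assumptions: the names crawl, fetch and LinkParser are invented here, the crawler stays on the same host as the home page, and max_pages is an arbitrary safety limit. The "validation" in this version is only the basics (skip non-http links, skip other hosts, skip pages already seen); you will surely need more.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch(url):
    """Download a page as text; return "" if the request fails."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(home_url, fetch_page=fetch, max_pages=50):
    """Visit the home page, then every same-host link, each page once."""
    host = urlparse(home_url).netloc
    queue = deque([home_url])
    seen = {home_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch_page(url)
        if not html:
            continue  # unreachable or empty page, skip it
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href)).url
            parts = urlparse(link)
            if (parts.scheme in ("http", "https")
                    and parts.netloc == host
                    and link not in seen):
                seen.add(link)
                queue.append(link)
    return pages
```

crawl returns a dict of url to raw HTML, so each entry can then go through the title, keyword and body steps before it is stored. Passing your own fetch_page function makes it easy to try the loop on a few fake pages without touching the network.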
Hope this will solve your problem.
Regards