How does Google search the forum messages inside TEK-TIPS?

Deadline · Feb 14, 2002

Hi,

This question is not exactly about ASP, but about a technique.

How does Google search the forum messages inside TEK-TIPS?

When you search for a topic using Google, the results include the relevant posts within TEK-TIPS too.

I always thought that TEK-TIPS uses a database to store the messages that we post. If that is the case, how is Google able to dig inside that database?

Does that imply that Google searches more than just plain HTML files? How do you think TEK-TIPS is structured to facilitate this?

Can you explain, please? Thank you...
RR

Salibas · Feb 15, 2002

Google uses a technique called a web spider. It actually "listens" to web traffic on the internet, regardless of where the data is coming from. When an internet user makes a request to a server, Google intercepts the data without affecting it and takes the data it wants. That is why it is advisable to use META tags in your HTML code, such as the "description" and "keywords" META tags, to help web crawlers get exact information about your web page.

A small example is the NetWatcher that usually comes with Microsoft Windows, where you can see which packets are sent from whom to whom; some net watchers can also show what is inside a packet.

Another thing that affects web spiders is the number of hits a page gets. This ranks a page higher in a search than others.

Salibas

http://www25.brinkster.com/salibas
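
For reference, the "description" and "keywords" META tags mentioned above sit in a page's head, and any program that fetches the page can read them. Here is a minimal Python sketch, assuming a made-up sample page (SAMPLE_PAGE) and hypothetical helper names (MetaTagParser, read_meta_tags). It only shows how a program could read the tags; it says nothing about how Google or any other engine actually uses them.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta_tags(html):
    """Return a dict of meta tag names to their content values."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# Made-up page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<meta name="description" content="A forum about ASP programming">
<meta name="keywords" content="asp, forum, programming">
</head><body>Hello</body></html>"""

if __name__ == "__main__":
    for name, content in read_meta_tags(SAMPLE_PAGE).items():
        print(name, "=", content)
```

A page with no META tags simply yields an empty dict here, while its body text is still available to whatever fetched it.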

Deadline · Feb 15, 2002

Thank you Salibas.

But when I "View Source" this page right now, this particular page doesn't have any kind of meta-tag whatsoever.

Despite this, the page will definitely be brought up in a Google search if searched appropriately.

This is what confuses me: how does Google search dynamically generated pages in ASP, CF, etc.?

What do you think? Thank you...
RR

link9 · Feb 15, 2002

The only way that a search engine like Google (and they all do it differently) can search a site is if the pages are stored somewhere with static content...

Like Salibas said, they use what's referred to as a spider, and that spider just crawls the pages, gleaning their information and then indexing it.

If a site that's served from a database is indexed on a search engine (i.e. this site), then those pages are sitting somewhere on the server, thereby becoming indexable. Now maybe the site developer has made a lot of special considerations to get listed, and puts those pages out there for searching purposes only, and then serves the regular pages right from the database... or maybe there's a mirrored site somewhere that's getting indexed... or maybe... who knows, really, and I doubt there'll ever be a straight answer in this vein, since I'm sure A LOT of work goes into making it work properly.

As far as getting listed on a search engine... people pay big bucks for "secrets", and others claim to have the "silver bullet" that will get you on all the search engines, but the truth of the matter is that there is no big secret way to get on. Lots of things contribute to your getting listed toward the top of a search engine:

meta tags on some search engines
actual page content on some search engines
getting linked to by many other sites usually helps

The list goes on and on, and being successful in this area is going to take a lot of work, a combination of what I've listed there, and probably about a hundred other things I have no idea about.
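
To make the "getting linked to by many other sites" item concrete, here is a minimal Python sketch that counts how many other sites link to each site. The link graph (LINKS) and the function name (count_inbound_links) are made up for illustration, and a plain count is an assumption. The thread does not say how any particular search engine weighs such links.

```python
from collections import Counter

# Hypothetical link graph: each site maps to the sites it links to.
LINKS = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}


def count_inbound_links(links):
    """Return how many distinct other sites link to each site."""
    inbound = Counter()
    for source, targets in links.items():
        # set() so that repeated links from one site count only once.
        for target in set(targets):
            if target != source:  # ignore a site linking to itself
                inbound[target] += 1
    return inbound


if __name__ == "__main__":
    for site, count in count_inbound_links(LINKS).most_common():
        print(site, count)
```

In this toy graph, c.example has the most inbound links (three), so a link-based signal alone would favour it.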

hope that helps!

paul

tsdragon · Feb 15, 2002

Actually, I doubt that Google has anything that actually "watches" web traffic. That would be an ENORMOUS invasion of privacy. However, it's also not necessary. It's very easy to write a program that fetches a web page and parses it. Then it can take any links it finds within that page and fetch them as well, and repeat the process ad nauseam. This is the technique that's commonly called "webcrawling". It works just as well with dynamic content as it does with static content. Once you fetch the page, you get the exact same HTML as a browser would, whether the HTML was static or dynamically generated. I could write a program to do it in Perl rather easily. In fact, I've used the technique to have a CGI program on one web server "call" a program on an entirely different web server.

Tracy Dryden
tracy@bydisn.com

http://www.bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
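
The fetch, parse, follow-links loop described in the post above can be sketched in a few lines. This is a minimal Python sketch (rather than Perl), not a description of Google's actual crawler. To stay runnable without a network, it assumes a stand-in fetch function (fake_fetch) serving a made-up site at http://forum.example/, whose thread pages are generated at request time from an in-memory MESSAGES dict playing the role of the forum database. All of these names are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for a forum database: thread id -> message text.
MESSAGES = {1: "How does Google search forum messages?", 2: "It crawls links."}
BASE = "http://forum.example/"


def fake_fetch(url):
    """Stand-in for an HTTP GET; returns HTML, or None if the URL is unknown.

    Thread pages are built from MESSAGES on each request, the way an ASP
    page would build them from a database. The index page is static.
    """
    if url == BASE:
        return '<a href="thread?id=1">one</a> <a href="thread?id=2">two</a>'
    for thread_id, text in MESSAGES.items():
        if url == f"{BASE}thread?id={thread_id}":
            return f'<p>{text}</p> <a href="/">home</a>'
    return None


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl: fetch a page, parse it, queue its links, repeat."""
    index = {}              # url -> HTML exactly as the server returned it
    seen = {start_url}      # URLs already queued, so none is fetched twice
    queue = deque([start_url])
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        index[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index


if __name__ == "__main__":
    for url in sorted(crawl(BASE, fake_fetch)):
        print(url)
```

The crawler never touches MESSAGES directly. It only sees the HTML that fake_fetch returns, so the database-backed thread pages end up in the index the same way the static index page does. A real crawler would replace fake_fetch with an actual HTTP request, for example via urllib.request.urlopen.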

link9 · Feb 16, 2002

I didn't mean watching the traffic... only that your site appears on many other sites as links. And that can help your rating on search engines.

chiph · Feb 18, 2002

Actually, I doubt that Google has anything that actually "watches" web traffic. That would be an ENORMOUS invasion of privacy.

As Comcast has found out:

http://www.cnn.com/2002/TECH/industry/02/13/internet.privacy.ap/index.html

Chip H.
