Text and links not visible in source code

Status
Not open for further replies.

Foamcow

Programmer
Nov 14, 2002
6,092
GB
I was using a web based "Spider Simulator" to look at a few sites.
I checked out the site of a company I used to work for and noticed something that I can't explain.
The easiest way to demonstrate it is to give you guys the links.

Here is the simulator:

Point it at:

Now, look at the spidered text. There are a load of words in there that don't appear on the page OR in the source code.

Same with the spidered links. Lots of links to sites presumably built by the same company. Again though, they are invisible and do not appear in the source code.

Logic says they MUST be somewhere in the source. I checked for a hidden frame and I checked the JavaScript, but I can't find them.

Now it is entirely possible that I missed them but can anyone shed any light on this?

To my mind it is something that is outside the "rules" laid out by the Search Engines. What are your thoughts?

 
My guess is that they're looking at the HTTP request and sending different pages depending on whether you're a spider or a real person. In particular, it presumably sends a text-based navigation instead of the JavaScript-driven one.

It seems like a lot of extra work to me, and something the SEs are unlikely to approve of if they find out.

-- Chris Hunt
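
To make the guess above concrete, here is a minimal sketch of what User-Agent based switching could look like on the server. It is only an illustration: the token list and the names `looks_like_spider` and `choose_navigation` are made up for this example, and nothing here is taken from the site being discussed.

```python
# Hypothetical substrings a server might look for in the User-Agent header.
ASSUMED_SPIDER_TOKENS = ("googlebot", "slurp", "infoseek")


def looks_like_spider(user_agent):
    """Return True if the User-Agent string contains an assumed spider token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in ASSUMED_SPIDER_TOKENS)


def choose_navigation(user_agent):
    """Pick which navigation to send: plain text links for spiders,
    the JavaScript-driven menu for everyone else."""
    if looks_like_spider(user_agent):
        return "text-links"
    return "javascript-menu"
```

With this sketch, a request announcing itself as `InfoSeek Sidewinder/0.9` would get the text links, while an ordinary browser string would get the JavaScript menu, which is why "View Source" in a browser never shows the extra content.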
 
Yep, I noticed they have moved the site from the server where I used to host it.

They've basically taken the site I built and partially rebuilt it in their "CMS". My guess from looking at their other sites is that they are all interlinked in the same way.

 
They are even a little blatant about it on one page: take a look at the 2nd result of a Google site: search.

It looks like they are using IP delivery or a cookie test, which is almost undetectable unless you check specifically for it (see the Google cache).
The technique will certainly get them banned once it's found.
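
For anyone unfamiliar with IP delivery, the idea can be sketched roughly as below: the server picks the page from the client's IP address rather than from anything in the request headers. This is an assumption-laden illustration, not the site's actual code. The network list uses a reserved documentation range as a placeholder, and `is_spider_ip` and `page_for` are hypothetical names. The cookie test variant is not shown.

```python
import ipaddress

# Placeholder network from a reserved documentation range (TEST-NET-1).
# These are NOT real crawler addresses; a real list would have to be maintained.
ASSUMED_SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_spider_ip(client_ip):
    """Return True if the client address falls inside an assumed spider network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ASSUMED_SPIDER_NETWORKS)


def page_for(client_ip):
    """Choose the page variant purely from the client's IP address."""
    return "spider-page" if is_spider_ip(client_ip) else "visitor-page"
```

Because the decision depends only on the source address, changing the User-Agent from an ordinary connection would not reveal the spider page, which is what makes this variant hard to spot.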







Chris.

Indifference will be the downfall of mankind, but who cares?
A website that proves the cobblers kids adage.
Nightclub counting systems

So long, and thanks for all the fish.
 
Yep I can see it now.
I used Sam Spade to make the HTTP request while "pretending" to be InfoSeek, and I got back code that contains links to the unrelated sites.
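
The same kind of check can be scripted. The sketch below is a rough Python equivalent, not what Sam Spade does internally: it fetches a page with a chosen User-Agent header and lists the links that only show up in the spider's copy. The helper names (`fetch`, `extract_links`, `spider_only_links`) are invented for this example, and the URL and User-Agent strings you pass in are up to you.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def fetch(url, user_agent):
    """Download a page while sending the given User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def spider_only_links(browser_html, spider_html):
    """Links present in the spider's copy but missing from the browser's copy."""
    return extract_links(spider_html) - extract_links(browser_html)
```

Calling `fetch` twice on the same URL, once with a browser User-Agent and once with a spider one, and passing both results to `spider_only_links` should return an empty set for an honest site. Note that this only catches User-Agent based switching; it will not expose delivery keyed on the client's IP address.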

So in a case such as this, who gets penalised: the site engineers or the company whose sites are being linked to?

 
The site will get banned outright, and sites banned for cloaking never seem to recover.

The linked sites will only seem like they are penalised when the link farm disappears.



Chris.

 