
Disable search engine indexing


Glasgow (IS-IT--Management)
Jul 30, 2001
I'm not sure if this is the forum to ask the question, but is there a way of preventing a domain from being indexed by search engines?

Also, if I have two domains pointing to the same site, can I disable indexing for one and not the other?

Thanks in advance.
 
Hi

Glasgow said:
I'm not sure if this is the forum to ask the question, but is there a way of preventing a domain from being indexed by search engines?
Strictly speaking this is not an HTML question, but it can be solved with HTML too.
Either create a [tt]robots.txt[/tt] file in the document root:
Code:
User-agent: *
Disallow: /
Or add [tt]meta[/tt] tags to all documents:
HTML:
<meta name="robots" content="noindex,nofollow">
See robotstxt.org for detailed documentation.
Glasgow said:
Also, if I have two domains pointing to the same site, can I disable indexing for one and not the other?
That is bad. Serve the content only for one domain, redirect requests for the other domain(s) to that one. Search forum828; this has already been discussed a few times.

Feherke.
 
Thank you for the reply - that is useful information / advice.

With regard to your remark:
That is bad. Serve the content only for one domain, redirect requests for the other domain(s) to that one.
Are you effectively suggesting that, while cost savings might be made by pointing two domains at the same site, this is a false economy and I would be better off paying for two sites where each page in one site effectively redirects to the equivalent page in the other?

This leads to some additional questions:

a) If I redirect, does this not mean that the URL in the browser's address bar will change to reflect the site that I have been redirected to? It is important to me that the user appears to be within the site they first arrived at.
b) If so, might I overcome this problem by using a #include instead of redirecting?
c) Am I right in thinking that your recommended approach would involve replicating every page within the site (whereby each contains the appropriate redirect / #include)?

Thanks again.
 
Hi

Glasgow said:
Are you effectively suggesting that, while cost savings might be made by pointing two domains at the same site, this is a false economy and I would be better off paying for two sites where each page in one site effectively redirects to the equivalent page in the other?
No. The search engines try to list relevant content. Duplicated content is not relevant. When they detect duplicated content, they pick one domain and list only that. The other domains are listed as duplicates only on request. There is no way to influence which domain is treated as the "original".

Read thread828-1373776, especially ChrisHirst's reply. Feel free to search for more similar threads.
Glasgow said:
a) If I redirect, does this not mean that the URL in the browser's address bar will change to reflect the site that I have been redirected to? It is important to me that the user appears to be within the site they first arrived at.
Yes, that is the point of the redirection.
Glasgow said:
b) If so, might I overcome this problem by using a #include instead of redirecting?
No. What you are doing on the server to generate the documents is not visible from outside.
Glasgow said:
c) Am I right in thinking that your recommended approach would involve replicating every page within the site (whereby each contains the appropriate redirect / #include)?
No. There should be no replication, only one setting in your web server.
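For example, with Apache that single setting can be a [tt]Redirect[/tt] in the secondary domain's virtual host (the domain names below are only placeholders):
Code:
<VirtualHost *:80>
    ServerName www.otherdomain.com
    # send every request permanently to the preferred domain
    Redirect permanent / http://www.maindomain.com/
</VirtualHost>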


Feherke.
 
Are you effectively suggesting that, while cost savings might be made by pointing two domains at the same site, this is a false economy and I would be better off paying for two sites where each page in one site effectively redirects to the equivalent page in the other?
You could do it that way, but it's a long-winded and expensive approach. The easy way is to point both domains at one set of hosted content, but to check (and maybe change) the URL that they come in on. If you're hosted on Unix/Linux, you'd do this with a .htaccess file and mod_rewrite. If you're on Windows, I'm sure there's a way to do it with IIS - but don't ask me how!

For example, something like this (not tested!) added to your .htaccess should do the trick if you're on unix:
Code:
RewriteCond %{HTTP_HOST}   !^www\.otherdomain\.com [NC]
RewriteRule ^(.*)          http://www.maindomain.com/$1 [L,R=301]


This leads to some additional questions:

a) If I redirect, does this not mean that the URL in the browser's address bar will change to reflect the site that I have been redirected to? It is important to me that the user appears to be within the site they first arrived at.
It does. Why is it important to you, and are you sure it's important to your users? I doubt most of them will notice.
b) If so, might I overcome this problem by using a #include instead of redirecting?
Erm... I can't think how.
c) Am I right in thinking that your recommended approach would involve replicating every page within the site (whereby each contains the appropriate redirect / #include)?
No. See my answer above.


-- Chris Hunt
Webmaster & Tragedian
Extra Connections Ltd
 
Thanks for the replies.

Our customer "John Smith" has a web site. There are third parties who want to sell his products on-line, and they will earn commission by doing so. The third parties would like to hide the fact that they are connecting to his site when their customers make a purchase; instead they want a domain that appears to be theirs. So rather than sending their customers to John Smith's own domain, they would prefer that the site they link through to sits under a domain of their own.
Other third parties may jump on the bandwagon, so we would have multiple domains pointing to the same site.

Hence retaining the "correct" URL on display is important, although I do agree that the vast majority won't notice. Mr Bloggs will perceive it as important! Mr Smith, on the other hand, will be concerned that, if a new third-party site starts to compete with his own in Google (even though he hosts it), he will have to pay commission on the sales that come via that route, so he would want to eliminate the possibility of such competition.

I don't really want to duplicate the pages, and redirection does not give me what I want, so I was thinking that I could place the "proper" pages in one folder on the (Windows) server (perhaps one that is hidden) and, in the third-party sites (and perhaps even the master site), each page would have a #include to pull the actual page content across from the file in the "hidden" folder.
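For what it's worth, that #include would just be a standard server-side include directive; something like the line below, where the folder and file names are only placeholders:
Code:
<!--#include virtual="/shared-content/products.asp" -->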

I hope that makes some sense.
 
Hmmm... well, the only way I see of going forward is - building on Feherke's suggestions - to send "don't index this" instructions to pages served under the other domain.

It should be possible to use some-windows-equivalent-of-mod_rewrite to send different versions of robots.txt depending on which domain the request comes in on. Maybe somebody on forum41 could help you.
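For instance, with the IIS URL Rewrite module installed, a rule along these lines in web.config might do it (untested; robots-noindex.txt is only a placeholder name for a robots.txt variant that disallows everything):
Code:
<rule name="Per-domain robots.txt" stopProcessing="true">
  <!-- only touch requests for robots.txt -->
  <match url="^robots\.txt$" />
  <conditions>
    <!-- ...that arrive on anything other than the preferred domain -->
    <add input="{HTTP_HOST}" pattern="^www\.maindomain\.com$" negate="true" />
  </conditions>
  <!-- silently serve the blocking version instead -->
  <action type="Rewrite" url="robots-noindex.txt" />
</rule>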

Alternatively, if you're building each page dynamically using, say, ASP, or if you can use server-side includes, you could write an include that checks the requesting domain name and returns a [tt]<meta name="robots" content="noindex,nofollow">[/tt] if it's not the preferred domain.
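A minimal classic ASP sketch of that idea (www.maindomain.com stands in for whatever the preferred host name is):
Code:
<%
' emit a blocking robots tag unless the request arrived on the preferred domain
If LCase(Request.ServerVariables("HTTP_HOST")) <> "www.maindomain.com" Then
    Response.Write "<meta name=""robots"" content=""noindex,nofollow"">"
End If
%>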

That should exclude the secondary domain(s) from search results, but will also mean that any links to the secondary domain(s) won't pass anything on to the main one.

If you were dealing with subdomains instead of fully fledged domains - e.g. bloggsmotors.johnsmithcarstuff.com - you could use the new canonical link tag to specify the main page. See the search engines' documentation on it for details, but note that it only works within the same domain.
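It is just a [tt]link[/tt] element in each page's [tt]head[/tt], something like this (the URL is only illustrative):
HTML:
<link rel="canonical" href="http://www.johnsmithcarstuff.com/products.html">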


-- Chris Hunt
Webmaster & Tragedian
Extra Connections Ltd
 
Thanks again.

I guess if I end up with two sites on the same server, I would have a separate robots.txt for each anyway so that problem would be solved.

And yes, server side includes or ASP could be used to block individual pages if necessary.

Another issue I've just thought of is that of security. I assume I will need a separate SSL certificate for each domain even if they point at the same site.
 