
Web Site with No Access to Outsiders


Chinyere
Programmer
Mar 9, 2001
Hello,

I need to set up a development web site. This is a test web site and should not be available to outsiders, search engines, and the like. The web site will be viewed by members of my company and our client ONLY.

How can I do this?

I am thinking that the best way to do this is by setting up a firewall. Does anyone out there have any ideas on how I should proceed?

It is very important that outsiders and search engines are not able to view this site. Thanks.

Chinyere
 
The search engine issue really isn't an issue. If you don't submit the URL to be indexed into their databases, they will not have it listed in search results.

Are you able to use ASP?
I may not get it the 1st or 2nd time,
but how sweet that 15th time can be.
admin@onpntwebdesigns.com
 
There is also a meta tag that you can place in your files that will prevent search engines and directories from indexing your pages. Just because you don't submit them to a search engine or directory does NOT prevent them from eventually becoming part of their databases.
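For reference, that is the robots meta tag. It goes in the head of every page you want kept out of compliant engines' indexes, along these lines:

    <head>
        <title>Internal test page</title>
        <meta name="robots" content="noindex,nofollow">
    </head>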

 
Lexus,

For a crawler to find your pages, there must first be a link to that page somewhere on the internet. A spider will never just stumble across a page that has no links pointing to it and has never been submitted to their database.

The robots meta tag and robots.txt do not stop all search engines from indexing a site. Both depend on the engine honouring them, and a lot of smaller engines do not respect the robots exclusion protocol, so against those engines the tags are useless.
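For completeness, the robots.txt side of the protocol is a plain text file served from the root of the site. A minimal one that asks all compliant crawlers to stay away from everything:

    # robots.txt, served from the site root
    User-agent: *
    Disallow: /

But as said above, only well-behaved spiders honour it.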

Hope this helps,

Wullie

sales@freshlookdesign.co.uk

 
Wullie:
You say: "For a crawler to find your pages, there must first be a link to that page somewhere on the internet. A spider will never just stumble across a page that has no links pointing to it and has never been submitted to their database."

Interesting! I had two identical internal files on our corporate website with NO links pointing to them and nothing anywhere linking to them. One had the NOINDEX meta tag and the other had no meta tags at all. The only difference between the files was the content of the title tags and the actual titles on the pages.

About 6 months later, I was able to find the one with no meta tags in a search on Google and another search engine! These were test pages for a new design I was working on and contained only text and HTML. No applets, scripts, or graphics were on these pages. No other employee even knew the files existed, as I am the only one with access to the web server. They were basically stripped-down files.

Any clue how the one became indexed while the other didn't? Thanks...

Lexus

 
Hi mate,

I cannot comment on a specific case like this because I don't know all the facts about it.

The most common reason that spiders find pages that nothing links to is that the spider encounters a log file of some sort.

The log file could be on your own site or on an external site that one of your pages links to. When you click the link, the referring page shows up in that site's logs; if those logs are publicly readable, a spider can find them and index the URLs they contain.

I can definitely tell you that a spider will NEVER stumble upon an unknown page that is not linked from anywhere unless someone submits it to be indexed; it is simply not possible.

A crawler reads a page, extracts the links, and may later index them. It does not guess URLs and therefore will never find a page that is unknown to the rest of the internet.

I have had this discussion with loads of people. Most people think that the large engines contain EVERY page on the internet; they don't. Even if a spider knows about a page, that does not mean it will be included in the index.

I have test areas that spiders have indexed, but that was my fault: I posted the URL here to show someone the page, and the spiders found the URL here and followed it. I also have other test areas that have never been found by a search engine because, as I said, they are not linked from anywhere.

Hope this helps,

Wullie

sales@freshlookdesign.co.uk

 
Chinyere, if you have ColdFusion support then do it that way.
What CF allows you to do is grant certain visitors authority to view a page, or deny certain visitors access to certain pages.
It's also not that difficult to do it that way. I can give a sample of how I'm doing it for one of my clients; see the sketch below.
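A minimal sketch of one common way to do this in CF, assuming a hypothetical allow-list of client IP addresses (the addresses are made up; Application.cfm runs before every page request):

    <!--- Application.cfm: turn away any visitor not on the allow-list --->
    <cfset allowedIPs = "10.0.0.5,192.168.1.20">
    <cfif NOT ListFind(allowedIPs, CGI.REMOTE_ADDR)>
        <cfabort showerror="Access denied">
    </cfif>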
If you don't know CF then you can also do it via .htaccess, like Wullie suggested...

I have not failed; I merely found 100,000 different ways of not succeeding...
 
If you go to google.com and type this instead of the normal URL:

http://www.google.com:80/

you will still access the web site like you would normally. What you are doing is specifying the port you are connecting on. By default HTTP is set to port 80.

I have two servers set up: one on port 80 (the standard port for HTTP) and another on port 8080.

To go to the normal one I type the plain URL, and for the second I add :8080 after the host name. For testing this is good enough.
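If the servers are Apache, a second test instance on port 8080 can be as simple as an extra Listen directive and a virtual host in httpd.conf (the path and server name below are hypothetical):

    # httpd.conf: serve the test copy on a second port
    Listen 80
    Listen 8080

    <VirtualHost *:8080>
        ServerName dev.example.com
        DocumentRoot "/var/www/devsite"
    </VirtualHost>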

If it is very important that outsiders cannot access this information, you can use Apache. Ask in a forum how to set it up with .htaccess so the site is protected with a username and password.
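The usual .htaccess approach looks something like this (the AuthUserFile path is hypothetical, and the password file is created with the htpasswd utility, e.g. htpasswd -c /home/private/.htpasswd someuser):

    # .htaccess in the directory to protect
    AuthType Basic
    AuthName "Development Site"
    AuthUserFile /home/private/.htpasswd
    Require valid-user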

I hope this helps.

Gary Haran
 