Robots using my contact forms!!!

Status
Not open for further replies.

Sleidia

Technical User
May 4, 2001
1,284
FR

I've noticed recently that what seem to be robots or worms or whatever are sending me spam via the contact forms on my websites!

The first time, I thought it was a spammer using the form manually, but it happened so many times that I ended up searching Google for "baiduicp@hotmail.com", which made me realize that I wasn't the only one being persecuted ;)

So, programmatically speaking, what should I do to prevent such a thing from happening again?

Thanks !
 
A spammer will not download your form, only send data directly to your script. If the timer is never started, then you can throw out the input.

It might work. It would fool the current 'bots. But keep in mind that you're implementing obscurity, not security. This lockout will be very easy to break.
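A minimal sketch of the timer check described above, assuming the page that renders the form records a timestamp in the session and the receiving script rejects any post that arrives with no timestamp or too soon after it. The function names, the 'form_shown_at' key, and the 10-second default are illustrative assumptions, not part of any library.

```php
<?php
// Hypothetical helpers: the names and the session key are assumptions for illustration.

// Call when the form is rendered: remember when this visitor was shown it.
function start_form_timer(array &$session, int $now): void
{
    $session['form_shown_at'] = $now;
}

// Call when the form is posted. Returns true only if the timer was started
// and at least $minSeconds have passed since then.
function timer_allows_submit(array &$session, int $now, int $minSeconds = 10): bool
{
    if (!isset($session['form_shown_at'])) {
        return false; // the form was never downloaded: data went straight to the script
    }
    $elapsed = $now - $session['form_shown_at'];
    unset($session['form_shown_at']); // one submission per rendered form
    return $elapsed >= $minSeconds;
}
```

In a real page you would call session_start() first and pass $_SESSION and time() as the arguments. As noted above, this is obscurity rather than security: a bot that fetches the form and waits will pass.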






Want the best answers? Ask the best questions! TANSTAAFL!
 
...But keep in mind that you're implementing obscurity, not security...

This is interesting.
Maybe I'm wrong, but I believe that spammers/hackers try to harm the maximum number of individuals. So they try to defeat the strategy that is used by the majority of people, which means that their counter-strategies are mainly aimed at security and not at obscurity.
 
I disagree.

Spammers would not necessarily try to defeat the strategy that is used by the majority of individuals. It is more than possible that they would try to defeat the strategy that is easiest to defeat. And your idea is pretty easy to defeat.




Want the best answers? Ask the best questions! TANSTAAFL!
 

Yeah, I know, but they would have to spend time on me (which is against their priority as we've seen before) and even notice that I exist (since the attackers are stupid worms that follow a pre-determined mainstream strategy).
 
That's not been my experience. My blog has a comment form, and all it has for protection is a field and a sentence saying 'put the word elbow in this field'.

It stopped every piece of comment spam dead. If it starts happening again, I'll just change the word.

The trick is that the spammers are using automated tools, so obscurity really does work in this instance. Captcha images are easy to find with automation. Parsing English isn't.

-
 
azimuth0:
The reason I'm not fond of the 30-second delay idea is that all it would take is a trivial event to figure out what the lockout is. Something as simple as my form-submitting app's hanging at the right moment could unlock that lockout.

I'm not a big fan of CAPTCHA images. To begin with, they're very difficult to implement well -- they have to be easy enough to read for a human but difficult enough to mess with a worm's OCR. (This site has some criticisms of CAPTCHA implementations they've found around the world.)

The "fill in the word" idea seems pretty good. It requires human interpretation of something simple on the site, and it can be tweaked easily (examples: change the entry word; change the method from "enter the word 'elbow'" to "enter the fifth word in the sentence: 'You can't stick your elbow in your ear.'").
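The "enter the fifth word in the sentence" variation could be checked with something like the sketch below. The function names are hypothetical, and the normalization rules (lower-casing, stripping surrounding punctuation) are my own assumptions about what a forgiving check should accept.

```php
<?php
// Hypothetical helper: returns the $n-th word (1-based) of $sentence,
// lower-cased and with surrounding punctuation stripped, or null if out of range.
function nth_word(string $sentence, int $n): ?string
{
    $words = preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);
    if ($n < 1 || $n > count($words)) {
        return null;
    }
    return strtolower(trim($words[$n - 1], ".,;:!?\"'"));
}

// Compares the visitor's answer with the expected word, ignoring case and padding.
function answer_matches(string $sentence, int $n, string $answer): bool
{
    $expected = nth_word($sentence, $n);
    return $expected !== null && strtolower(trim($answer)) === $expected;
}
```

With the sentence from the post, nth_word("You can't stick your elbow in your ear.", 5) gives 'elbow'. Changing the sentence or the position changes the expected answer without touching the checking code.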

I think I'd recommend the "fill in the word" method over CAPTCHA images for a small site that is unlikely to attract too much human attention.

And I'd definitely recommend "fill in the word" over a timed delay.


Want the best answers? Ask the best questions! TANSTAAFL!
 

Ok ...hmmm ... what about this idea :

Within the textarea, make PHP dynamically output a text string with the following caption : "please leave this security code intact in your message".

If, as you said, the worms directly inject into the scripts without passing through the form fields, then a check of the existence of the security code inside the message would suffice to detect a spam.

What do you think?
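A rough sketch of the "security code left in the textarea" idea, assuming the code is generated per form view and kept in the session. The function names, the 'security_code' key, and the "[code: ...]" marker format are all made up for illustration.

```php
<?php
// Hypothetical helpers; the session key 'security_code' is an assumption.

// Generates a random code, remembers it, and returns the text to pre-fill
// into the textarea.
function issue_security_code(array &$session): string
{
    $code = bin2hex(random_bytes(4)); // 8 hex characters
    $session['security_code'] = $code;
    return "[code: $code] please leave this security code intact in your message\n";
}

// True only if a code was issued for this session and the message still contains it.
function message_has_code(array &$session, string $message): bool
{
    if (!isset($session['security_code'])) {
        return false; // no form was ever served to this visitor
    }
    $code = $session['security_code'];
    unset($session['security_code']); // each code is valid once
    return strpos($message, "[code: $code]") !== false;
}
```

A bot that posts straight to the script never receives the code, so its message fails the check. A bot that downloads the form and echoes the textarea contents back would still get through.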
 
The reason I'm not fond of the delay idea is that you run into user-expectation management issues that require you to tell the whole world how your human-detector works. I mean, you're going to have to tell your user why it is that you're not going to let him submit his message to you now, and that you will allow him to submit shortly.

And spamming is not all worms and bots; there are human beings involved in the industry. And if one of them happens to hit your site by hand, try to submit a message to you and see the error "I'm sorry, but in order to thwart spammers, I've installed a 30-second delay before you can submit this form. You can resubmit in 11 seconds", then that spammer will get curious about how you did what you did. It will take less than 60 seconds to deduce how you did it, that others might try it, and how to work around it using only data that can be stored simply in a database.

And it can be circumvented easily. It does not require the interpretation of an image or the parsing of an English sentence.



Another method that is similar to the "fill in the blank" method is using variable form-names. Typically, a "send me a message" form has two fields, one for a subject and another for a body. Randomly pick two strings to be the names for the form fields and store those names in a session variable. When the user submits the form, the script accepting the data can know from the session variable what fields to look for. This won't ensure that a human submitted the data, but it will at least require that the form be downloaded every time.

I would put this ahead of the 30-second delay method in terms of security, if only because your users will never know what is going on. It's less secure than the "fill in the word" idea, though.


Want the best answers? Ask the best questions! TANSTAAFL!
 

...
The reason I'm not fond of the delay idea is that you run into user-expectation management issues that require you to tell the whole world how your human-detector works. I mean, you're going to have to tell your user why it is that you're not going to let him submit his message to you now, and that you will allow him to submit shortly.
And spamming is not all worms and bots; there are human beings involved in the industry. And if one of them happens to hit your site by hand, try to submit a message to you and see the error "I'm sorry, but in order to thwart spammers, I've installed a 30-second delay before you can submit this form. You can resubmit in 11 seconds", then that spammer will get curious about how you did what you did. It will take less than 60 seconds to deduce how you did it, that others might try it, and how to work around it using only data that can be stored simply in a database.
And it can be circumvented easily. It does not require the interpretation of an image or the parsing of an English sentence.
...

Too many assumptions here ;)
In my idea, the user isn't supposed to notice what's going on. Before the delay is elapsed, the form should react as if the email was effectively sent even if it wasn't. And if we assume that worms traverse sites very quickly, I might even try to set the delay to 10 seconds instead of 30.
I'm not saying it's the perfect idea, but it's one that doesn't require the user to do extra tasks.

On a side note, I like the use of the "security code" that the user just has to leave in the textarea. Once you come back, I'll be impatient to hear your opinion about it.



 
I got an early opportunity to go online.

In my idea, the user isn't supposed to notice what's going on. Before the delay is elapsed, the form should react as if the email was effectively sent even if it wasn't.
Then if your user doesn't wait 30 seconds, your script is just going to pretend that it sent the email? That sounds pretty high-handed to me. We are talking, of course, about your site, but I simply could not implement something like a potentially-pretend email sender. I like the behavior of my sites to be more deterministic than that.


As I've said in an earlier post, I think that the idea of a "fill in the word" test is much superior to a "wait a while" requirement. This is doubly true if you're not going to tell your users they need to actually wait the time.

In terms of which I think will best protect your site, in order of most protection to least, I would rank the ideas:

1. a well-designed CAPTCHA system
2. a "randomly assign the names of the input field" system
3. a "fill in the word" system which uses a random word and randomly-phrased question for each use of the function
4. a badly-designed CAPTCHA system
5. a "fill in the word" system which uses a constant word that must be periodically changed by hand
6. an unannounced wait


Want the best answers? Ask the best questions! TANSTAAFL!
 

Hmmm ... I'll have to forget about the delay detection because I remembered that there are people who write their message in an offline text editor before pasting it into the form field. The idea fails only when the user opens the form later just to paste his message. Whether or not to allow pasting is the only issue with this solution, because no one could possibly type a meaningful message in less than 10 or 30 seconds.

Since I don't like any idea requiring a task from the end user, I think that the best idea so far is the security code to leave intact in the textarea where the message is supposed to be typed. It's a task but at least a passive one.
Don't you think it's a viable idea?
 
WordPress uses Akismet (akismet.com), which appears to use a heuristic to test the comment content for spam. I have not played with this, but it might offer an alternative for discussion in this thread.

It appears to be free for personal users of WordPress, and there is a licensing scheme for other uses.
 
So long as you inform your users what it is they need to do, that should work.

Here, too, is a little proof-of-concept code that shows the use of dynamically-named form-fields with fieldnames stored in session variables:

Code:
<?php
// Build a random 10-letter field name (A-Z).
function generate_fieldname ()
{
	$retval = '';
	for ($counter = 0; $counter < 10; $counter++)
	{
		$retval .= chr (random_int(65,90));
	}
	return $retval;
}

session_start();

print '<html><body>
';

if (isset($_SESSION['form_names']))
{
	// A form was issued earlier in this session; look for the field names we generated.
	$names = $_SESSION['form_names'];
	// The names are single-use, whether or not the submission is accepted.
	unset ($_SESSION['form_names']);

	if (isset($_POST[$names['subject']]) && isset($_POST[$names['body']]))
	{
		// Escape user input before echoing it back.
		print 'Send email.<br>Subject:' . htmlspecialchars($_POST[$names['subject']]) . '<br>Body:' . htmlspecialchars($_POST[$names['body']]);
	}
	else
	{
		print "There was a problem.  I could not send the email.";
	}
}
else
{
	print '<form method="post" action="' . htmlspecialchars($_SERVER['PHP_SELF']) . '"><br>
	';

	// Pick fresh field names and remember them for the script that receives the post.
	$_SESSION['form_names'] = array('subject' => generate_fieldname(), 'body' => generate_fieldname());

	print 'Subject: <input type="text" name="' . $_SESSION['form_names']['subject'] . '"><br>
	Body: <textarea name="' . $_SESSION['form_names']['body'] . '"></textarea><br>
	<input type="submit"></form>';
}

print '
</body></html>';
?>


Want the best answers? Ask the best questions! TANSTAAFL!
 

Thanks for your code. Unfortunately, I won't use it because I output my forms in a much more "exotic" way ;)

But it's good to know that the security code option is a good one. And I think that it's a good compromise for the end user.
 

Hi Robert,

To make it short, I use a user input error detection script of mine that needs to collect form values and names on every form element. Each form element calls a function with specific parameters.

To be frank, I could use Sleipnir's idea but I'd have to make it from scratch to make it compatible with my existing code.
I haven't much time on my hands to do that right now. Maybe later.
 
Hi all,

PMFJI

First, glyphs have become a very normal way to protect a site against misuse by bots, and they are better accepted than you think or fear.

There are more user-friendly ways to use glyphs: at the least you can offer a "next" button so the user can generate another glyph if the one presented is too hard. And for blind users you can play a short sound/song and ask for the word/sentence in it.

Although it may not prevent spammers, it's always a good idea to tell the good bots where to go and where not to go with a robots.txt

Then you can also make use of things that are invisible to the user but not to bots: include links in comments, or have 1x1 pixel images with links, especially links to dynamically generated pages that each link to further such pages. This keeps bots busy, and you can detect them when they follow these links. The idea is not limited to links; you can also generate several useless forms (invisible to normal browser users) leading to other form pages. Make these the first and last links/form submits in your pages and bots will almost certainly enter that trap.
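One simple variation on the invisible-trap idea is a decoy field inside the real form rather than a whole decoy form. The sketch below is only an illustration of that variation: the field name 'website', the inline style, and the function names are assumptions, and it only catches bots that fill in every field they find.

```php
<?php
// Hypothetical decoy-field check; the field name 'website' is an assumption.

// Markup for a field that is hidden from people with CSS but present in the HTML.
function decoy_field_html(string $name = 'website'): string
{
    $safe = htmlspecialchars($name, ENT_QUOTES);
    return '<div style="display:none"><input type="text" name="' . $safe . '" value=""></div>';
}

// A human never sees the field, so it should arrive empty.
// A bot that fills in every field it parses gives itself away.
// A post that omits the field entirely is NOT flagged by this check.
function looks_like_bot(array $post, string $name = 'website'): bool
{
    if (!isset($post[$name])) {
        return false;
    }
    $value = $post[$name];
    return !is_string($value) || trim($value) !== '';
}
```

In the receiving script you would pass $_POST to looks_like_bot() and quietly drop the message when it returns true.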

JavaScript can also be used to create parts of the web page that won't work without it. Don't assume the spammer's toolset is a browser: since spammers are interested in quickly finding forms they can exploit, they will surely request your page with a bot, not with a browser that also loads pictures and executes script code. Bots request the plain-text HTML page and parse it themselves for links and forms. So keep them from finding the word 'submit' by generating the submit button with document.write(...'sub'+'mit'...). In the same manner you can obscure links and mail addresses.

Bye, Olaf.
 
This has been a great thread! I had previously looked at CAPTCHAs somewhat after I had to shut down a guestbook due to spam, but the CAPTCHA stuff seemed so complex (generating images and everything) that I never got time to actually implement it. But reading this thread gave me lots of new ideas. So I thought I'd let you know what I decided to implement.

I know what some of you (particularly Sleidia) are saying about the appeal of an invisible solution that doesn't require the user to fill in another field. I considered the idea of changing the names of the fields. But the battle back and forth to make better robots and then better robot-defeating code will go on indefinitely - soon the robots will be filling out the actual forms with actual browsers (even ones that run JavaScript) rather than just sending form post data. The best defense long-term is to ensure that the one filling out the form is a living, breathing human, and the only way I can imagine doing that is to ask them to do something that's really simple for a human, but impossible (or close to it) for a machine.

So I took the "elbow" example mentioned earlier on this thread, and made it one step more secure. I made a new table in my database with questions and answers, real easy stuff that anyone who speaks English and is over the age of six can handle, like:
"the word for the sum of two plus two" -> "four"
"the first three letters of the alphabet" -> "abc"
"the number of fingers on a hand (as a word)" -> "five"
"the rodent name used for a common PC device" -> "mouse"
The one step up in security is that the answer word never appears in the question text, although probably that wouldn't matter anyway.

Then, when I build my form, I select one at random ("select * from test_human order by rand() limit 1") and put the question on the form and the answer in a session variable. Then, in the form processing script, I just check for the right answer. It's way easier than trying to make images that can outsmart an OCR robot, and I don't think it will be too much of a bother for the users - as many have said, users of forms are getting used to seeing these kinds of checks.
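The question-and-answer flow described above might look roughly like this. To keep the sketch self-contained, an in-memory array stands in for the test_human table, and the function names and the 'human_answer' session key are assumptions rather than the poster's actual code.

```php
<?php
// Hypothetical sketch: an in-memory array stands in for the test_human table.
const HUMAN_TESTS = [
    'the word for the sum of two plus two' => 'four',
    'the first three letters of the alphabet' => 'abc',
    'the number of fingers on a hand (as a word)' => 'five',
    'the rodent name used for a common PC device' => 'mouse',
];

// Picks a random question, stores its answer in the session, returns the question.
function pose_question(array &$session): string
{
    $question = array_rand(HUMAN_TESTS);
    $session['human_answer'] = HUMAN_TESTS[$question];
    return $question;
}

// Checks the posted answer against the stored one (case-insensitive), once.
function is_human(array &$session, string $answer): bool
{
    if (!isset($session['human_answer'])) {
        return false; // no question was ever posed to this visitor
    }
    $expected = $session['human_answer'];
    unset($session['human_answer']);
    return strtolower(trim($answer)) === $expected;
}
```

The form page would print the result of pose_question($_SESSION) next to an extra text field, and the processing script would pass that field's value to is_human().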
 
... soon the robots will be filling out the actual forms with actual browsers (even ones that run JavaScript)...

Very interesting.
Is such a thing really possible?
Isn't it a little bit far-fetched?
 