Stop Form Spam Robots

Stop Form Spam Robots Using HoneyPot and Time Measuring

Let’s stop the robots, stop form spam!

One of the most common problems with having any kind of form on a site or blog is receiving spam. The more popular the site, the more likely it is to become a target for spammers, and the harder it becomes to stop form spam. That’s why it’s important to protect your forms from the very beginning.

Most spamming entities are spam robots. It’s rare for a human to sit there and fill in forms manually. Working from that assumption, it’s relatively easy to trick those robots and make them fail.

A Captcha is an easy and effective way of fighting spam, but it comes at a cost that our users have to pay: one more step to complete in the process.

One method that has worked for me to stop form spam is the use of HoneyPot and Time Measuring. These two techniques are powerful when used together.

So let’s start with a sample form which has 3 fields: Name, Email address and Message:
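The form markup is not reproduced here, so the following is only a minimal sketch of what such a form could look like. The helper name renderSampleForm and the field names are assumptions made for illustration; the action points at the backend.php file mentioned later in the article.

```php
<?php
// Hypothetical helper: returns the markup of the three-field sample form.
// Field names ("name", "email", "message") are assumptions for this sketch.
function renderSampleForm(): string
{
    return <<<'HTML'
<form action="backend.php" method="post">
    <label for="name">Name</label>
    <input type="text" id="name" name="name" required>

    <label for="email">Email address</label>
    <input type="email" id="email" name="email" required>

    <label for="message">Message</label>
    <textarea id="message" name="message" required></textarea>

    <button type="submit">Send</button>
</form>
HTML;
}

echo renderSampleForm();
```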

A spam robot will attempt to fill in all the fields with data in order to send the form, and this is what I’m going to use to our advantage.

First, let’s implement the HoneyPot. Stop form spam with sweetness!

Picture: bossfight.co

The goal here is to have a dummy field hidden by CSS and trick the spam robot into filling it in. Then, in the backend validation, I check whether the field has a value; if it does, I assume that the request is spam.

First, I add a field to the form; this will be my HoneyPot field. The important thing here is to make the new field blend in. For our example, I’m going to add a “Phone number” field whose label text is “Leave this field empty”. That label text is for users whose browsers have CSS disabled, so they know they have to leave it empty.

Two important things to notice:

  1. The input field has autocomplete="off".
  2. The name attribute is “phoneNumber6tY4bPYk”: a short random suffix follows the “phoneNumber” part.

Both keep browsers from inserting data by default or through autocompletion.
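As a rough sketch, the HoneyPot field could be rendered like this. The field name and label text come from the description above; the helper name renderHoneypotField and the wrapper class “form-extra” are assumptions chosen for illustration.

```php
<?php
// Field name from the article: "phoneNumber" plus a short random suffix.
const HONEYPOT_FIELD = 'phoneNumber6tY4bPYk';

// Hypothetical helper: returns the markup of the HoneyPot field.
// The wrapper class "form-extra" is an assumption; any neutral name works.
function renderHoneypotField(): string
{
    $name = htmlspecialchars(HONEYPOT_FIELD, ENT_QUOTES);

    return <<<HTML
<div class="form-extra">
    <label for="{$name}">Leave this field empty</label>
    <input type="text" id="{$name}" name="{$name}" value="" autocomplete="off">
</div>
HTML;
}

echo renderHoneypotField();
```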

In addition, I make the new field invisible with CSS:
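The stylesheet itself is not shown here, so this is just one possible rule, emitted from PHP to stay in the same language as the backend. It assumes the HoneyPot field sits in a wrapper with the hypothetical class “form-extra”, and it uses display: none as the simplest way to hide it.

```php
<?php
// Hypothetical helper: returns a <style> block that hides the HoneyPot wrapper.
// The class name "form-extra" is an assumption, not taken from the article.
function renderHoneypotStyles(): string
{
    return <<<'HTML'
<style>
    .form-extra {
        display: none;
    }
</style>
HTML;
}

echo renderHoneypotStyles();
```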

Now, let’s take a look at the backend validation. I check the HoneyPot field value ($honeyPot). If it’s not empty, the form was most probably sent by a spam robot:
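A minimal sketch of that check, assuming the form is posted and the HoneyPot field keeps the name used above. The function name isHoneypotFilled is hypothetical; treating a non-string value (for example an array) as spam is my own choice, not something the article states.

```php
<?php
// Field name from the article.
const HONEYPOT_FIELD = 'phoneNumber6tY4bPYk';

// Returns true when the HoneyPot field carries any value.
function isHoneypotFilled(array $post): bool
{
    $honeyPot = $post[HONEYPOT_FIELD] ?? '';

    // A real browser sends a plain string here; anything else is suspicious.
    if (!is_string($honeyPot)) {
        return true;
    }

    return trim($honeyPot) !== '';
}

// Demo with hand-made request data instead of $_POST.
var_dump(isHoneypotFilled(['name' => 'Ann', HONEYPOT_FIELD => '']));         // bool(false)
var_dump(isHoneypotFilled(['name' => 'Bot', HONEYPOT_FIELD => '555-0100'])); // bool(true)
```

In backend.php the call would typically be isHoneypotFilled($_POST), followed by whatever spam handling you prefer.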

Time Measuring technique. Stop form spam on the clock!

Picture: bossfight.co

Spam robots are very fast, and they use that speed to their advantage. We can turn things around and use their speed to make them stand out and detect them.

In order to do that, I add a new field to our form. It has to be a hidden field, and its value is the encrypted timestamp of the moment the form was loaded:
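The original listing is not available here, so what follows is a stand-in sketch rather than the article’s own code. It assumes PHP’s OpenSSL extension and uses AES-256-GCM, which also detects tampering. The key handling and the helper renderTimestampField are assumptions; the field name “formLoaded6tY4bPYk” is the one the backend reads.

```php
<?php
// Name of the hidden field, as read later by the backend.
const TIMESTAMP_FIELD = 'formLoaded6tY4bPYk';

// Stand-in Encryptor: AES-256-GCM via the OpenSSL extension (an assumption).
final class Encryptor
{
    private const CIPHER = 'aes-256-gcm';
    private $key;

    public function __construct(string $secret)
    {
        // Derive a 32-byte key from the configured secret.
        $this->key = hash('sha256', $secret, true);
    }

    public function encrypt(string $plain): string
    {
        $iv = random_bytes(12); // 96-bit nonce, the usual size for GCM
        $tag = '';
        $cipherText = openssl_encrypt($plain, self::CIPHER, $this->key, OPENSSL_RAW_DATA, $iv, $tag);
        if ($cipherText === false) {
            throw new RuntimeException('Encryption failed');
        }
        return base64_encode($iv . $tag . $cipherText);
    }

    // Returns null when the value is malformed or was tampered with.
    public function decrypt(string $encoded): ?string
    {
        $raw = base64_decode($encoded, true);
        if ($raw === false || strlen($raw) <= 28) {
            return null;
        }
        $iv = substr($raw, 0, 12);
        $tag = substr($raw, 12, 16);
        $cipherText = substr($raw, 28);
        $plain = openssl_decrypt($cipherText, self::CIPHER, $this->key, OPENSSL_RAW_DATA, $iv, $tag);
        return $plain === false ? null : $plain;
    }
}

// Hypothetical helper: hidden input holding the encrypted load time.
function renderTimestampField(Encryptor $encryptor, int $now): string
{
    $value = htmlspecialchars($encryptor->encrypt((string) $now), ENT_QUOTES);
    return '<input type="hidden" name="' . TIMESTAMP_FIELD . '" value="' . $value . '">';
}

// The secret below is a placeholder; load a real one from configuration.
echo renderTimestampField(new Encryptor('demo-secret-change-me'), time());
```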

Of course, the code in the Encryptor class, like all the other code here, is only for demonstration purposes. You have to implement a more secure encryption method; you can find a strong encryption algorithm in this blog post.

The validation

Finally, I pick up the backend.php file and add the validations for the time measuring technique.

The backend receives the parameter “formLoaded6tY4bPYk”, which holds the encrypted timestamp of when the form was loaded. I decrypt it and compare it with the current timestamp to find out how long the user took to fill in the form.
I then make sure that the field value was actually received and that the user didn’t spend less time than the minimum defined earlier.
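The steps above can be sketched roughly as follows. The function name looksLikeTimedSpam and the 5-second minimum are assumptions, since the actual threshold is not given here. The decryption step is passed in as a callable so the sketch stays self-contained; in practice it would be whatever decrypt method your encryption class provides.

```php
<?php
const TIMESTAMP_FIELD = 'formLoaded6tY4bPYk'; // parameter name from the article
const MIN_SECONDS = 5;                        // assumed minimum fill time

// Returns true when the submission is missing the timestamp, carries an
// unreadable one, or arrived faster than MIN_SECONDS after the form loaded.
function looksLikeTimedSpam(array $post, callable $decrypt, int $now): bool
{
    $encrypted = $post[TIMESTAMP_FIELD] ?? null;
    if (!is_string($encrypted) || $encrypted === '') {
        return true; // field not received at all
    }

    $loadedAt = $decrypt($encrypted);
    if (!is_string($loadedAt) || preg_match('/^\d+$/', $loadedAt) !== 1) {
        return true; // could not be decrypted, or is not a timestamp
    }

    $elapsed = $now - (int) $loadedAt;
    return $elapsed < MIN_SECONDS;
}

// Demo only: an identity "decryptor" so the example runs without a key.
$fakeDecrypt = function (string $value): ?string {
    return $value;
};

var_dump(looksLikeTimedSpam([TIMESTAMP_FIELD => '1000'], $fakeDecrypt, 1002)); // bool(true), 2 seconds
var_dump(looksLikeTimedSpam([TIMESTAMP_FIELD => '1000'], $fakeDecrypt, 1030)); // bool(false), 30 seconds
```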

Final considerations

  1. It’s good practice to log spam attacks for further analysis, especially the whole serialised request information.
  2. Also, consider adding the spam bot’s IP address to your blacklist, or contributing to anti-spam projects such as https://www.projecthoneypot.org/
  3. When spam activity is detected, show a generic error message like “We’re sorry, the form could not be sent at the moment. Please try again later” rather than “SPAM detected”.
  4. All the code presented in this article is only to illustrate the idea and is not intended for production.
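Points 1 and 3 could be combined along these lines. This is only a sketch: the helper names, the JSON log format and the use of PHP’s error_log are assumptions, and a real application would probably write to its own log channel.

```php
<?php
// Generic message shown to the visitor; it does not reveal that spam was detected.
const GENERIC_ERROR = "We're sorry, the form could not be sent at the moment. Please try again later.";

// Hypothetical helper: one JSON log line holding the serialised request.
function buildSpamLogLine(string $reason, array $post, array $server, int $now): string
{
    $entry = [
        'time'      => gmdate('c', $now),
        'reason'    => $reason,
        'ip'        => $server['REMOTE_ADDR'] ?? 'unknown',
        'userAgent' => $server['HTTP_USER_AGENT'] ?? 'unknown',
        'post'      => $post,
    ];

    return (string) json_encode($entry, JSON_UNESCAPED_SLASHES | JSON_PARTIAL_OUTPUT_ON_ERROR);
}

// Logs the attempt and returns the message to display.
function rejectAsSpam(string $reason, array $post, array $server): string
{
    error_log(buildSpamLogLine($reason, $post, $server, time()));
    return GENERIC_ERROR;
}
```

In backend.php this might be called as rejectAsSpam('honeypot', $_POST, $_SERVER) whenever one of the checks fails.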

Picture with robots by epicantus.tumblr.com

Antonio Valdez Arce

Software Development Engineer

More than 10 years of experience in the IT industry has given me the opportunity to develop the ability to learn fast. From developing desktop applications to pushing pixels in a web site, this business demands that professionals keep up to date and widen their range of specialisation. That's why I constantly try to stay in contact with new technologies and encourage my peers to do so.

  • vincenzo vecchio

    Hi thanks for your post, so far the best I’ve found 😉

    • Antonio Valdez Arce

      Hello Vincenzo, thank you! I’m glad it’s useful 🙂

  • behnam bozorg

    I’ve seen a bot implementation that could check whether your honeypot is visible in the viewport or not. Also, they have become smart enough to add some delay. Do you have any alternative trick or suggestion?

    • Antonio Valdez Arce

      Hello Behnam, thanks for your comment!
      You’re right. Spam protection as well as web security is an active job and it’s far from easy. Our solutions should evolve together with the attacks. We should always try to be one or more steps ahead.

      To answer your question, if the bot is smart enough to make a delay or a set of delays, then it would be impossible to stop it using the time measuring technique. But fortunately we still have the honeypot.

      In order to make it harder for the bot to recognize the honeypot field, I would randomize the class name of the form element(s). Also I’d randomize the actual position of the element. And if your application allows it, randomize the name of the element in the request.

      Spam robots are getting smarter because there are humans behind them, developing them. That’s why a logging mechanism is very important: it helps us understand how our system is being attacked and create better solutions.

      After all, we humans are lazy, and a human attacker would rather take his/her bot to another victim than spend too much effort adapting it to attack one specific site.

      • behnam bozorg

        Hi Antonio, Thanks for the reply. I agree with your points here. What I’ll do is to resize the honeypot element to something literally invisible. I also removed the border completely. As far as I researched, bots can determine if the element is not visible somehow on the screen. Also, I’ll add at least two honeypots with different types (one textfield, one checkbox and a selectlist) This increases the possibility of them being filled by the bot. Thanks

        • Antonio Valdez Arce

          Hi Behnam. That’s a very good idea!
          I would also consider randomizing the order of the honeypot fields among the real ones, for example:

          First time the form is loaded:

          Real field 1
          Real field 2
          Honeypot field
          Real field 3

          After reloading the form:

          Real field 1
          Honeypot field
          Real field 2
          Real field 3

          This is for the case when the bot memorizes the position of the honeypot field(s).

          • behnam bozorg

            Thanks Antonio. I’ll be in touch.