Coding Archives - elexicon

May

Preventing Spam Without Captcha

Posted by: Elexicon

Spam emails are extremely annoying. Unfortunately, spambots are getting smarter and smarter every day. People have developed some pretty clever methods to prevent spam, but the most popular are also an inconvenience to your users. I’m speaking, of course, about Captcha.

According to the Captcha website, Captcha is

“a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot.”

Typically, the test will consist of distorted text embedded in an image.

But what if the user can’t read the distorted text produced by Captcha? It becomes a nuisance to hit the refresh button multiple times to get a legible Captcha so you can submit the form. Being in the user-experience business, we went looking for a better solution. We looked into several different technologies, but almost all of them were too bloated in size for us to find them appealing, so we had to come up with another way.

First, some explanation about spam bots: they will typically fill out every input field in a form whether or not it is visible to the user. This useful piece of information has led to the creation of the honeypot method. The honeypot method consists of putting a blank input field in your form and hiding it from the user. The bot will come across this input field and fill it in. If the form field is filled in, the sender should be marked as a bot and the form should not send. Unfortunately, setting an input field with display: none; isn’t enough to combat spam anymore. It is certainly a step forward, but as I mentioned above, spam bots are getting smarter every day and many of them have figured this little trick out and can work around it.

To prevent spam on our new forms, we used a few different “honeypot” methods combined into one form to determine if the user is a bot or not. We first implemented the standard blank honeypot text input field and set it to display: none;. When the form is submitted, we then perform a server-side check using PHP to see if the input was filled out. If so, we trigger an error and prevent the form from sending. After testing this method, we waited a couple of days to gauge its effectiveness. Unfortunately, we were still receiving some spam emails each day using this out-of-the-box honeypot functionality.

We then began investigating more solutions to combat spam. This is where we really learned just how smart these spam bots are becoming. When we had first implemented the honeypot input field, we had named the input “anti-spam”. This was a bad idea. Bots are able to read through the input attributes and determine what type of input is focused and, apparently, are able to determine if the input field is supposed to be filled out based upon the name. So we learned a valiable lesson: name your honeypot fields something completely irrelevant to combating spam.

After changing the honeypot input name to something like “promo_code” we waited another day or two to see if we had better results. A day or two went by without spam, but on the second or third day we received another spam email. This was an improvement from the previous rate, but still unacceptable.

That’s when we realized some bots can bypass input fields set to display: none;. So diving headlong deeper into the spam battle, we implemented a method found in MailChimp’s subscription form that sets the honeypot input field to position: absolute; left: -5000px;. This allowed the input field to be “visible” but positioned off screen so the normal user couldn’t see it. We weren’t going to stop there, though. We’d had enough of those emails trying to sell us shoes and prescription drugs.

To be sure no bots could send our contact form, we implemented second and third honeypot inputs. The second honeypot field was an HTML5 email input. I do believe this field was the key in preventing our spam emails. Bots, no matter how smart, will ALWAYS fill out an email input. Just be sure to name your honeypot email input field something different than your actual email input field and name it something totally unrelated to spam prevention. We used the name “email_2” for our honeypot email input. This email input was “hidden” the same way the other input field was, by setting it in a <div> with position: absolute; left: -5000px; set. If either one of these input fields were populated, the form would trigger an error and not send.

The third and final method we used to combat spam is something a little more in-depth and tricky than your average honeypot input. When a spam bot finds a form on a page it will typically fill it out within 5-10 seconds and submit. This is much faster than what a human can do, so we figured it would be wise to do a check against how long it took to fill out the form. To do this, we put in a hidden input field and set the value to populate upon page load with the time the page was loaded. When the form is submitted, we perform a server-side check if the time between loading the page and submitting the page is larger than the minimum time it takes to fill out the form. We set this minimum time to 10 seconds to be sure we weren’t going to prevent any real users from sending our form. If the form is submitted in under 10 seconds, it will trigger and error and not send.

It’s been just over a week since we’ve implemented these spam prevention techniques and we haven’t seen a single spam email come through since. As spam bots continue to evolve, however, we may have to revisit this solution down the line. But that’s part of the spam arms race that’s not going away any time soon.

Definitions / category: coding

Preventing Spam Without Captcha

Categories

Tags

Archives