Prevent email address harvesting (part 2)

Webmasters often ask how to stop harvesting programs and robots from collecting, or stealing, email addresses from their own or their clients' websites. The question can be answered with a few programming tips.

Implementing these tips successfully can significantly improve the content security of your website.

This page is the second part of our prevent email address harvesting tutorial. You can find the first page here: How to protect web pages from email harvesting.

How to prevent email address harvesting - tips 2

The following is a list of tips that can help you to prevent email harvesting. Go to the link referenced above for more.

Email address in HTTP redirect

One way to prevent email address harvesting is to write a server-side script that returns the mailto:foo@example.com link as an HTTP redirect. All modern browsers recognize a mailto: URL in the Location response header, but not every harvester is capable of understanding it. Here is an example showing how this can be done in PHP. You display your email link as, for example:

<a href="email_address.php">This is my email address, click here.</a>

The content of the email_address.php file is the following:

<?php
header ("Location: mailto:foo@example.com");
exit();
?>

Note that this does not require a special Apache module such as mod_rewrite; the PHP header() call sends the redirect itself, provided it runs before any other output is sent. When the visitor clicks the link, the browser requests the email_address.php file, which sends mailto:foo@example.com back to his or her browser, where it may appear in the address bar.

If the visitor's computer is set up properly, its mail application should be able to capture the email address and populate the To: field with it.

The advantage of this approach is that the email address is not directly visible in the page source. Theoretically, however, a harvester that follows links could still read the address from the Location response header.
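For sites that do not run PHP, the same redirect can be sketched in JavaScript for a Node.js server. This is only an illustration: the handler name handleEmailRedirect and the port number in the comment are assumptions, not part of the original example.

```javascript
// Hypothetical handler; it relies only on writeHead() and end(),
// which the response object of Node's built-in http module provides.
function handleEmailRedirect(req, res) {
  // The address lives only in server-side code, never in the page markup.
  const address = "foo@example.com";
  // 302 tells the browser to go to the URL given in the Location header.
  res.writeHead(302, { Location: "mailto:" + address });
  res.end();
}

// Possible wiring (not executed here):
//   const http = require("http");
//   http.createServer(handleEmailRedirect).listen(8080);
```

The page would then link to whatever URL this handler is mounted on, just as the PHP version links to email_address.php.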

Email address and mailto as JavaScript

This is another common technique to prevent email address harvesting. Instead of using the plain <a href="mailto:address"></a> HTML tag, you write out the same markup using JavaScript. There are numerous ways of doing this, but the concept is always the same: break the email address into parts that the email address harvesting program cannot easily parse from the source code. The beauty of JavaScript is that this can be done in many ways. The simplest script would be:

<script type="text/javascript">
 document.write("foo@example.com")
</script>

This script displays the email address in the browser. The email address is not clickable, and the nice thing is that it does not include the mailto: prefix, which is what email address harvesting programs look for. The bad thing is that the email address is still visible in the source and can be easily parsed if the spam bot is set up to look for the @ symbol. The next step in our security cookbook is to split the email address into pieces.

<script type="text/javascript">
 document.write("foo" + "&#x0040;" + "example" + "." + "com")
</script>

In this example, the email harvester would need to be smart enough to join the individual strings and also to translate the &#x0040; entity into the @ symbol. (How did we come up with &#x0040;? Hexadecimal 40, which is decimal 64, is the ASCII code of the @ symbol. Take a look at the ASCII table and try our ASCII to hex converter.)
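The entity lookup can also be done in code. The following is a minimal sketch; the function names toHexEntity and encodeAddress are made up for this illustration.

```javascript
// Convert one character to a zero-padded hexadecimal HTML entity.
function toHexEntity(ch) {
  const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  return "&#x" + hex + ";";
}

// Encode every character of an address as an entity.
function encodeAddress(address) {
  return Array.from(address, (ch) => toHexEntity(ch)).join("");
}

console.log(toHexEntity("@")); // prints &#x0040;
```

Encoding the whole address rather than just the @ symbol is a design choice; a harvester that decodes entities will defeat either variant.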

If you are still worried that the email harvester might strip out the " + " and reassemble the email link, there is always something more you can do. Variables can have information added to their existing contents and that new content can even include the existing variable. If you wanted to make the code more complicated, you could use something like the following:

<script type="text/javascript">
var string1 = "foo";
var string2 = "@";
var string3 = "example";
var string4 = ".";
var string5 = "com";
var string6 = string1 + string2 + string3 + string4 + string5;
document.write('<a href="' + "mail" + "to:" + string6 + '">' + string6 + "</a>");
</script>

This looks pretty challenging for the email harvesting program, does it not? You can even combine this method with email faking and using images. Here is another example of what can be done in JavaScript.

<a href='javascript:window.location="mail"+"to:"+"foo"+"@"+"example"+"."+"com";'
onmouseover='window.status="mail"+"to:"+"foo"+"@"+"example"+"."+"com"; return true;'
onmouseout='window.status="";return true;'>Click here to send mail.</a>

The drawback of the JavaScript method is that the email address is visible on screen only in browsers that support JavaScript. Browsers that have JavaScript turned off, or do not support it at all, would not display the email address.
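The pieces above can also be combined without document.write. This is a sketch under assumptions: the function names and the element id "contact" are hypothetical, and the page is assumed to contain a placeholder element with that id.

```javascript
// Assemble the address from fragments so it never appears whole in the source.
function buildAddress(parts) {
  return parts.user + "@" + parts.host + "." + parts.tld;
}

// Build the clickable link markup from the same fragments.
function buildMailtoLink(parts) {
  const address = buildAddress(parts);
  return '<a href="' + "mail" + "to:" + address + '">' + address + "</a>";
}

const addressParts = { user: "foo", host: "example", tld: "com" };

// In a browser, fill the placeholder element; outside a browser this is skipped.
if (typeof document !== "undefined") {
  const target = document.getElementById("contact");
  if (target) target.innerHTML = buildMailtoLink(addressParts);
}
```

The placeholder element can hold fallback text, for example a pointer to a contact form, for visitors whose browsers do not run JavaScript.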

Email address via CSS2 pseudo-element :after

Here is another great technique that you can use to prevent email address harvesting. First, you would define your CSS code:

.emailDiv:after { content: "foo@example.com"; }

The class can be defined either in the HEAD of your page or in your *.css file. You can of course replace the @ symbol with a CSS escape such as \000040 (HTML entities are not interpreted inside the content property). Once you have your CSS defined, you would display your email address as follows:

<div class="emailDiv">This is my email address: </div>

The browser would display this code as This is my email address: foo@example.com. The downside of this technique is that only browsers that can interpret CSS2 generated content will display the address. As of the end of 2008, MSIE does not display it; Firefox works fine.
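If you generate your stylesheet with a script, the rule can be produced as in the following sketch. The function names are hypothetical. The six-digit form of the escape is used deliberately: a shorter escape such as \40 would absorb a following hexadecimal-looking letter, such as the "e" in "example", unless it were followed by a space.

```javascript
// Replace "@" with its six-digit CSS escape.
function cssEscapeAt(text) {
  return text.replace(/@/g, "\\000040");
}

// Build the :after rule for a given class name and address.
function emailAfterRule(className, address) {
  return "." + className + ':after { content: "' + cssEscapeAt(address) + '"; }';
}

console.log(emailAfterRule("emailDiv", "foo@example.com"));
// prints .emailDiv:after { content: "foo\000040example.com"; }
```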

Email address through CSS2 unicode-bidi (text direction)

Another technique to prevent email address harvesting is based on changing the direction of the text. The key in this email address harvesting prevention method is to change the direction of text from left-to-right (default) to right-to-left. First, you would define your CSS code:

div.codedirection { unicode-bidi: bidi-override; direction: rtl; }

and then you would display the email address on your page as

<div class="codedirection">moc.elpmaxe@oof</div>

The browser will display the email address as foo@example.com. The nice feature of this method is that the address appears only in reversed form in your HTML code, while the rule that restores it can live separately in your *.css file. Browsers without CSS2 support will display the address backwards, which could be quite bothersome for the visitor to invert.
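Reversing addresses by hand is error-prone, so a small helper can produce the markup. This is a sketch; the function names are made up for illustration, and the class name is the one from the CSS rule above.

```javascript
// Reverse the characters of the address for use with direction: rtl.
function reverseForRtl(address) {
  return Array.from(address).reverse().join("");
}

// Wrap the reversed address in the element styled by the CSS rule.
function rtlMarkup(address) {
  return '<div class="codedirection">' + reverseForRtl(address) + "</div>";
}

console.log(reverseForRtl("foo@example.com")); // prints moc.elpmaxe@oof
```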

Stuff email address with CSS display:none

Display none is another nice technique to prevent email address harvesting. In this case, we interlace the email address with some filler text, which the display:none property then hides when the browser renders the page. First, you would define your CSS:

.hideThisText { display:none; }

Now you would use this CSS class in your text.

foo@example<span class="hideThisText">[REMOVETHIS]</span>.com

The browser would display the email address as foo@example.com. Browsers that support the display:none property (most browsers do) will not show the [REMOVETHIS] to the user. Browsers that do not support display:none will show the [REMOVETHIS], and the user will hopefully remove it before sending email to the address. The address is available to the user as text; however, the user cannot click a link to open their email client.
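The interlaced markup can be generated rather than typed. The sketch below inserts the filler just before the last dot of the address. The function name stuffAddress is hypothetical, and the use of an inline span element for the hidden text is an assumption of this sketch.

```javascript
// Insert a hidden filler element before the last dot of the address.
function stuffAddress(address, filler) {
  const lastDot = address.lastIndexOf(".");
  if (lastDot === -1) return address; // nothing to split on
  return (
    address.slice(0, lastDot) +
    '<span class="hideThisText">' + filler + "</span>" +
    address.slice(lastDot)
  );
}

console.log(stuffAddress("foo@example.com", "[REMOVETHIS]"));
// prints foo@example<span class="hideThisText">[REMOVETHIS]</span>.com
```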

Use forms for emails

If you want to completely prevent email address harvesting, using forms is the best option. In this case, no email address is displayed on the website. The user has to contact you by filling out a form, and a server-side script forwards the data from the form to your email. Your email address is very safe. Spam robots simply pass over this area because the source code contains no email address.

The disadvantage of this method is that your form can get spammed by content spammers, but that is another story. You would need to protect your form with a CAPTCHA.
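The server-side part can be sketched as follows. This is only an illustration under assumptions: the field names name, email, and message, the constant RECIPIENT, and the function buildContactMessage are all hypothetical, and actually sending the message (and checking a CAPTCHA) is left to whatever mail and form tools your server provides.

```javascript
// The recipient is configured on the server and never sent to the browser.
const RECIPIENT = "foo@example.com";

// Turn submitted form fields into a message object ready for a mailer.
function buildContactMessage(form) {
  const name = String(form.name || "").trim();
  const replyTo = String(form.email || "").trim();
  const body = String(form.message || "").trim();
  if (!name || !replyTo || !body) {
    throw new Error("All fields are required.");
  }
  // Reject line breaks in header fields so that a submission
  // cannot smuggle extra mail headers into the message.
  if (/[\r\n]/.test(name) || /[\r\n]/.test(replyTo)) {
    throw new Error("Invalid characters in a header field.");
  }
  return {
    to: RECIPIENT,
    replyTo: replyTo,
    subject: "Contact form message from " + name,
    text: body
  };
}
```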

Which method is the best?

It depends. First of all, please note that there are many variations of the above methods, and they can be combined to produce a unique solution. Whatever you do on your website, remember to test your code fully in all major browsers and their versions.

Remember that some methods limit accessibility. Web pages are accessed not only through MSIE, Firefox, Opera, and Netscape on an x86-based PC, but also on Macs, through Linux and Unix, and some visitors may use cell phones and other hand-held devices. Visually impaired people use text readers.

When choosing the right method for your application, remember that it does not need to be the most complicated one. Theoretically, the authors of email harvesters could write code that breaks or decodes every method listed here. If something can be engineered, it can be reverse-engineered. However, consider how much code a harvester would need to account for every method listed here, and multiply that by the number of sites and pages a bot has to go through to collect a good number of addresses. Accounting for every method would take considerable resources. So most email address harvesting programs are kept simple and primitive in order to stay small and fast. Even minimal measures can fool a large portion of harvesters.
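To test your own pages against the simplest kind of harvester, you can run their source through a plain pattern match. The function below is a deliberately naive sketch, not a model of any particular harvesting program; its name and its regular expression are assumptions made for this illustration.

```javascript
// Look for anything shaped like user@host.tld in raw page source.
function naiveHarvest(source) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return source.match(pattern) || [];
}

// A plain mailto link is found immediately.
console.log(naiveHarvest('<a href="mailto:foo@example.com">mail</a>'));
// prints [ 'foo@example.com' ]

// The reversed-text and display:none variants yield nothing.
console.log(naiveHarvest('<div class="codedirection">moc.elpmaxe@oof</div>'));
// prints []
console.log(naiveHarvest('foo@example<span class="hideThisText">[REMOVETHIS]</span>.com'));
// prints []
```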

All these methods have their pros and cons, and the method you pick is entirely up to you. We will be glad if you share your experience with us and give us your feedback in the Maxi-Pedia Forum.
