Alternatives to totally banishing mailto:
There are alternatives to completely removing email addresses,
but they all depend on the stupidity of the spambot, and so
could be compromised by a new generation of pest. These include:
* Write out email addresses in a non-email format, e.g. instead
of writing 'user@example.com' you would write 'user at example
dot com', or something similar. It would only take some spambot
with a little more intelligence to be able to scan these patterns
and pick up "likely" addresses, so this strategy
is a little risky. Any consistent method you choose to write
out email addresses could in theory be analyzed and decoded
by a savvy bot.
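To make the risk concrete, here is a minimal sketch of how little code a harvester would need to undo the "user at example dot com" style. The regular expression and the function name are my own illustration, not taken from any real spambot, and real harvesters may work quite differently.

```python
import re

# Hypothetical pattern a harvester might use for the spelled-out style;
# it is an assumption for illustration, not code from an actual bot.
SPELLED_OUT = re.compile(
    r"\b([\w.+-]+)\s+at\s+([\w-]+(?:\s+dot\s+[\w-]+)+)\b",
    re.IGNORECASE,
)


def decode_spelled_out(text):
    """Return "likely" addresses recovered from 'user at example dot com'."""
    found = []
    for match in SPELLED_OUT.finditer(text):
        user, host = match.group(1), match.group(2)
        # Turn every " dot " between host labels back into a period.
        host = re.sub(r"\s+dot\s+", ".", host, flags=re.IGNORECASE)
        found.append(f"{user}@{host}")
    return found


print(decode_spelled_out("Write to user at example dot com for details."))
```

A pattern like this also produces false positives (any phrase of the form "word at word dot word"), but a spammer is unlikely to care about a few bad addresses.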
* URL-encode email addresses (suggested by Anthony Martin).
Most browsers allow the mailto: URL to contain URL-encoded
values: the string "%40" stands for the "@" symbol, while
"%2E" stands for the period. For that matter, you could
URL-encode the entire address (name, host and domain), so
it's one long encoded string. This might work in the short
term, but it's relatively easy for spambots to get smarter
and decode it.
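As a rough sketch of the "one long encoded string" idea, the function below percent-encodes every byte of an address on the server side. The function name is hypothetical. I encode by hand because Python's urllib.parse.quote leaves letters, digits and the period unescaped, which is not what this technique wants.

```python
from urllib.parse import unquote


def percent_encode_all(address):
    """Percent-encode every byte of the address, letters included."""
    return "".join(f"%{byte:02X}" for byte in address.encode("utf-8"))


encoded = percent_encode_all("user@example.com")
print(f'<a href="mailto:{encoded}">Mail me</a>')

# The weakness described above: one standard-library call undoes it.
print(unquote(encoded))
```

The last line is the reason this is only a short-term measure: decoding is a single library call for any bot that bothers to try.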
* Use HTML character entities to encode email addresses (suggested
by Seann Herdejurgen). Similar to the previous method. For
example, <a href="mailto:user&#64;example&#46;com">user&#64;example&#46;com</a>,
where "&#64;" is the "@" symbol and "&#46;" is the period.
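Here is a small sketch of generating entity-encoded links on the server, assuming you want every character turned into a numeric character reference. The function name is my own invention; the browser decodes the entities, so the visitor sees and clicks a normal address.

```python
import html


def entity_encode(address):
    """Replace each character with its numeric HTML entity, e.g. '@' -> '&#64;'."""
    return "".join(f"&#{ord(ch)};" for ch in address)


encoded = entity_encode("user@example.com")
print(f'<a href="mailto:{encoded}">{encoded}</a>')

# Like URL encoding, this is trivially reversible by a bot that tries.
print(html.unescape(encoded))
```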
* Add stuff to the email address to make it invalid, but
so that a human could easily know what to do to make it work.
An example of this is writing 'username@_NO_SPAM_example.com'.
You need to remove the "_NO_SPAM_" part to make
the email address valid. You can have some kind of explanation
to make it clear what people have to do to use the address.
Personally, I don't like this - you're depending on a level
of sophistication on the part of your users which is risky.
In my experience, there are a lot of very 'novice' level users
out there, who only know how to click on a link. They don't
know how to edit an email address. Heck, I've had people come
to my site by typing the URL into Google, rather than the
'Location' box of their browser. Also, people don't read instructions.
* Make graphics images which contain the email address. Spambots
usually don't download graphics, and even if they did, they
probably couldn't decode the bits to get the text. However,
they could do it in theory, since software for doing OCR (optical
character recognition, getting text from scanned documents)
has been around for a while. A downside to this approach is
that the user has to manually copy down the email address,
since it can't be cut'n'pasted. Also, you can't put a mailto:
link on the image, otherwise you're back to square one. Finally,
blind people (who use braille browsers) will have a BIG problem
with pure graphics (unless you put in some kind of ALT text,
using the techniques previously mentioned to obfuscate the
email address). You could also put a link to a contact form,
with an argument in the link telling your server internally
what email address to use. For example, the link could say
"contact.cgi?to=23", where '23' is some database
key to the actual email address. But the downside here is
that you still need to generate the image, which is a bit
of a pain in the ass if you have a lot of them. You can do
it automatically, if you're willing to put the work in and
write the scripts. There are some very nice graphics generation
packages out there on CPAN for Perl. Here's an example of
an email address presented as an image:
[image: an email address rendered as a graphic]
Robert Logan tells me that the PBM package (which seems to
be packaged with Linux) is a great way to generate these graphics,
for example:
shell> echo user@example.com | pbmtext | pnmcrop | pnmpad -white -l2 -r2 -t2 -b2 > email.pnm
shell> convert email.pnm email.gif
This produces the following, which looks pretty neat and
tidy:
[image: the generated email.gif]
An alternative to this (suggested by Andrew Park) is to just
make certain characters into graphics, which can then be used
again and again for all kinds of email addresses. For example,
you could make a GIF of the '@' symbol, and possibly other
common parts such as ".com" and ".org".
If you have code on the server side that can then automatically
convert email addresses into the appropriate HTML, then this
will fool most spambots (for now!).
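The server-side conversion could look something like the sketch below. The image paths, the ALT texts and the function name are all assumptions of mine for illustration; the idea is just that the "@" and a common suffix never appear as text in the generated HTML.

```python
import html

# Hypothetical image files and ALT texts; reuse them for every address.
AT_IMAGE = '<img src="/img/at.gif" alt=" at ">'
SUFFIX_IMAGES = {
    ".com": '<img src="/img/dotcom.gif" alt=" dot com">',
    ".org": '<img src="/img/dotorg.gif" alt=" dot org">',
}


def address_to_html(address):
    """Render an address with '@' (and a known suffix) swapped for images."""
    user, _, host = address.partition("@")
    suffix_html = ""
    for suffix, img in SUFFIX_IMAGES.items():
        if host.endswith(suffix):
            host = host[: -len(suffix)]  # drop the suffix from the text part
            suffix_html = img
            break
    return f"{html.escape(user)}{AT_IMAGE}{html.escape(host)}{suffix_html}"


print(address_to_html("user@example.com"))
```

The ALT texts here fall back on the "at"/"dot" spelling, so they share that method's weakness, but they keep the address usable for people who can't see the images.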
* Use JavaScript to make your email links hard to recognise
for spambots. I personally don't like my site to be dependent
on JavaScript, since I turn it off in my own browser (mostly
for security reasons and to avoid the popup and popunder ads).
But, there have been a number of methods suggested for doing
this, for example:
o From Marcell Toth:
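<html>
<script type="text/javascript">
// Assemble the mailto: URL only when the link is clicked,
// so the complete address never appears in the page source.
function SendMail(Login, Server)
{
    // window.navigate() only exists in Internet Explorer;
    // setting window.location.href works in other browsers too.
    window.location.href = "mailto:" + Login + "@" + Server;
}
</script>
<body>
<a href="javascript:SendMail('marcell.toth', 'nextra.hu')">Mail me</a>
</body>
</html>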
o A JavaScript email encryptor (thanks to Joe Tucek for the
link)
o From Brandon Gillespie:
There is a fourth means of dealing with the mailto: link I didn't see mentioned,
but which I have had good success with. Instead of doing href="mailto:foo@bar"
you create an obfuscated javascript function for each domain (for me they are
all mailed to the same domain, so it's easy), like:
function m_sfcon (u) {
    // Split "mailto:" so the literal string never appears in the source.
    var pre = "mail";
    var url = pre + "to:" + u;
    document.location.href = url + "@sfcon.org";
}
Then use:
href="javascript:m_sfcon('myusername')"
* Some other interesting ideas:
o From Thomas "Balu" Walter:
While working on my new homepage I found myself asking how to defend against
those bots. I didn't want to break my eMail-address or to hide it using
javascript or images - especially because my visitors should be able to use
mailto: links as expected.
My provider set up a "catchall" mailbox where all mails are stored that are
sent to my domain @example.com. Since I am developing my pages using PHP I
thought of a way to make the address unique for each visitor. The result was
the following small function:
function generateMail(){
    // $HTTP_SERVER_VARS no longer exists in current PHP; $_SERVER replaces it.
    // is a proxy in use?
    if (!empty($_SERVER["HTTP_X_FORWARDED_FOR"])) {
        $ip = $_SERVER["HTTP_X_FORWARDED_FOR"];
    } else {
        $ip = $_SERVER["REMOTE_ADDR"];
    }
    return "web-".sprintf("%u", ip2long($ip)).".".time()."@example.com";
}
This generates an address in the form
web-32bitIP.timestamp@example.com
This way I can easily reject addresses that were found by bots and are used
for SPAMming. I even know where the bot came from and when. I can even find
them in the webserver-logfiles and analyze their activity.
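To show how the "where from and when" part might work in practice, here is a sketch that builds the same web-32bitIP.timestamp address and then decodes a spammed address back into an IP and a time. It is my own illustration rather than Thomas's code: the function names are hypothetical, and it assumes IPv4 addresses only.

```python
import ipaddress
import re
import time

DOMAIN = "example.com"  # placeholder domain, as in the text above


def generate_mail(ip, now=None):
    """Build web-<32-bit IP>.<timestamp>@example.com for one visitor."""
    if now is None:
        now = int(time.time())
    ip_number = int(ipaddress.IPv4Address(ip))  # same value as PHP's ip2long
    return f"web-{ip_number}.{now}@{DOMAIN}"


def decode_mail(address):
    """Recover (ip, timestamp) from a generated address, or None."""
    match = re.fullmatch(r"web-(\d+)\.(\d+)@" + re.escape(DOMAIN), address)
    if match is None:
        return None
    ip_number = int(match.group(1))
    if ip_number > 0xFFFFFFFF:  # not a valid 32-bit IPv4 value
        return None
    return str(ipaddress.IPv4Address(ip_number)), int(match.group(2))


address = generate_mail("192.0.2.1", now=1000000000)
print(address)
print(decode_mail(address))
```

With the IP and timestamp in hand, you can look up the matching request in the webserver logs, as described above.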
o From Ilmari Karonen (in response to the update regarding
newer versions of spambots which use google to find pages,
and then follow no links on my website, thus foiling any link
traps):
If the spambot is indeed not following links, an obvious solution is to
feed all mailto: links through a redirector script.
On a site I'm currently building, I'm doing the following:
1. All email links are given as "/email/?h=host&u=user".
2. The directory /email is disallowed in robots.txt.
3. Any URL under /email which is _not_ in the above format acts as a
spambot trap.
4. All pages contain links to "/email/something_random.html".
This works great as long as there are no e-mail addresses visible on the
page. I'm currently obfuscating those by inserting the HTML code
<font size="1" color="white" style="font-size: 1px;">X</font>
on either side of the @ sign. I figure a bot has to be pretty damn
clever to de-obfuscate that, while it's pretty obvious to a human even
if the CSS hiding trick fails.
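The decision logic behind steps 1 to 4 could be sketched roughly as follows. This is only my reading of the scheme, not Ilmari's actual script: the function name and the returned labels are made up, and a real script would go on to send an HTTP redirect or log the trapped client.

```python
from urllib.parse import parse_qs, urlsplit


def handle_email_request(url):
    """Classify a requested URL according to the redirector scheme.

    Returns ("redirect", "mailto:...") for a well-formed email link,
    ("trap", None) for anything else under /email/, and
    ("ignore", None) for URLs outside that directory.
    """
    parts = urlsplit(url)
    if not parts.path.startswith("/email/"):
        return ("ignore", None)
    query = parse_qs(parts.query)
    well_formed = (
        parts.path == "/email/"
        and set(query) == {"h", "u"}
        and len(query["h"]) == 1
        and len(query["u"]) == 1
    )
    if well_formed:
        return ("redirect", f"mailto:{query['u'][0]}@{query['h'][0]}")
    # Only a robot ignoring robots.txt should ever request these URLs.
    return ("trap", None)


print(handle_email_request("/email/?h=example.com&u=user"))
print(handle_email_request("/email/something_random.html"))
```

In a real deployment you would also want to validate the host and user values before putting them into a redirect, since they come straight from the query string.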
As you can see, there are many ways you can make email addresses
harder for spambots to recognise. It all depends on your own
expertise and preferences. Still, in my opinion the only totally
safe way to ensure spambots can't harvest email addresses
is to totally remove them from your website! Can't get around
that one, no matter how smart they get...