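Hide email address from bots - keep mailto:

tl;dr

Hide email address from bots without using scripts and maintain mailto: functionality. Method must also support screen readers.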


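Summary

  • No use of scripts or contact forms to provide the email address

  • Email address needs to be completely visible to human viewers and maintain mailto: functionality

  • Email address must not be in image form

  • Email address must be "completely" hidden from spam crawlers, spam bots, and any other type of harvester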


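Desired effect:

  • No scripts, please. There are no scripts used in the project and I'd like it to stay that way.

  • The email address is either displayed on the page or can be easily displayed after some sort of user interaction, like opening a modal.

  • The user can click on the email address, which in turn triggers the mailto: functionality.

  • Clicking on the email opens the user's email application.

    In other words, mailto: functionality must work.

  • The email address is not visible to bots, or not identified by them as an email address (this includes the page source).

  • I don't have an inbox that's filled with spam.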


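What does NOT work

  • Adding a contact form, or anything similar, instead of the email address

I hate contact forms. I rarely fill out a contact form. If there's no email address, I look for a phone number, and if there's none, I start looking for an alternative service. I would only fill out a contact form if I absolutely had to.

  • Replacing the address with an image of the address

This creates a HUGE disadvantage for someone using a screen reader (remember the visually impaired in your future projects).

It also removes the mailto: functionality unless you make the image clickable and add the mailto: as the href of the link, but that defeats the purpose because the email is now visible to bots.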


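What might work:

  • Clever usage of CSS pseudo-elements

  • Solutions that make use of base64 encoding

  • Breaking up the email address and spreading the parts across the document, then putting them back together in a modal when the user clicks a button (this will probably involve multiple CSS classes and the usage of anchor tags)

  • Altering HTML attributes via CSS

@MortezaAsadi gracefully brought up this possibility in the comments below. This is the link to the full article, which is from 2012:

What if we could use CSS to alter HTML attributes?

  • Other creative solutions that are beyond my scope of knowledge.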

Similar questions / fixes

(This is a great fix suggested by Joe Maller. It works well, but it is script-based.)

<SCRIPT TYPE="text/javascript">
emailE = 'example.com'
emailE = ('yourname' + '@' + emailE)
document.write('<A href="mailto:' + emailE + '">' + emailE + '</a>')
</script>
<NOSCRIPT>
Email address protected by JavaScript
</NOSCRIPT>

(JavaScript fix)

The selected answer actually works really well. It involves encoding the email as HTML entities. Can it be improved?

This is what it looks like:

<A HREF="mailto:&#121;&#111;&#117;&#114;&#110;&#097;&#109;&#101;&#064;&#100;&#111;&#109;&#097;&#105;&#110;&#046;&#099;&#111;&#109;">
&#121;&#111;&#117;&#114;&#110;&#097;&#109;&#101;&#064;&#100;&#111;&#109;&#097;&#105;&#110;&#046;&#099;&#111;&#109;
</A>
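For illustration only, markup like this could be generated ahead of time rather than typed by hand. The sketch below assumes a build-time JavaScript step (nothing runs in the page itself); the function name `toEntities` is hypothetical and not part of the linked answer.

```javascript
// Hypothetical helper: turn each character into a zero-padded decimal
// HTML entity, matching the &#121;&#111;... style shown above.
function toEntities(text) {
  return Array.from(text)
    .map((ch) => "&#" + String(ch.codePointAt(0)).padStart(3, "0") + ";")
    .join("");
}

// Build the anchor once, then paste the result into the static HTML.
const encoded = toEntities("yourname@domain.com");
const link = '<a href="mailto:' + encoded + '">' + encoded + "</a>";
console.log(link);
```

The output is plain HTML, so the page stays script-free; only the authoring step uses code.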

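(The selected answer to this Super User question is great; it presents a study of the amount of spam received when using different obfuscation methods.)

It seems that manipulating the email address with CSS to make it rtl does work. This is the same method used in the first question I linked to in this section.

I am not certain what effect adding mailto: functionality to the fix would have on the results.

  • There are also many other questions on SO, all of which have similar answers. I have not found anything that fits my desired effect.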
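As a rough sketch of the rtl idea mentioned above: the address is stored reversed in the markup and CSS flips it back for human readers. The helper name `reverseForRtl` is hypothetical, and the snippet is meant as an authoring-time aid, assuming the `unicode-bidi: bidi-override; direction: rtl;` styling commonly used for this trick.

```javascript
// Hypothetical authoring helper: reverse the address so that the raw
// HTML contains "moc.elpmaxe@resu" instead of the readable form.
function reverseForRtl(text) {
  return Array.from(text).reverse().join("");
}

const reversed = reverseForRtl("user@example.com");

// The CSS below displays the reversed text right-to-left, so a sighted
// reader sees the original address while the source holds it backwards.
const markup =
  '<span style="unicode-bidi: bidi-override; direction: rtl;">' +
  reversed +
  "</span>";
console.log(markup);
```

Note that this only covers the visible text; a working mailto: href would still need the real address somewhere, which is exactly the open question here.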

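Question:

Would it be possible to increase the efficiency (i.e. as little spam as possible) of the email obfuscation methods above by combining two or more of the fixes (or even adding new fixes) while:

A- Maintaining mailto: functionality; and

B- Supporting screen readers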


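Many of the answers and comments below pose a very good question while indicating that it is impossible to do this without some sort of JS.

The question being asked/implied is:

Why not use JS?

The answer is that I am allergic to JS.

Just kidding, though.

The three main reasons I asked this question are:

  • Contact forms are becoming more and more accepted as a replacement for providing an email address, which they should not be.

  • If it can be done without scripts, then it should be done without scripts.

  • Curiosity: (as I am in fact using one of the JS fixes) I wanted to see if discussing the matter would lead to a better way of doing it.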


The issue with your request is specifically the "Supporting screen-readers", as by definition screen readers are a "bot" of some sort. If a screen-reader needs to be able to interpret the email address, then a page-crawler would be able to interpret it as well.

Also, the point of the mailto: scheme is to be the standard way to link email addresses on the web. Asking if there is a second way to do that is sort of like asking if there is a second standard.

Doing it through scripts will still have the same issue as once the page is loaded, the script would have been run and the email address rendered in the DOM (unless you populate the email address on click or something). Either way, screen readers will still have issues with this since it's not already loaded.

Honestly, just get an email service with a half decent spam filter and specify a default subject line that is easy for you to sort in your inbox.

<a href="mailto:no-one@example.com?subject=Something to filter on">Email me</a>

What you're asking for is if the standard has two ways to do something, one for bots and the other for non-bots. The answer is it doesn't, and you have to just fight the bots as best you can.

People who write scrapers want to make their scrapers as efficient as possible. Therefore, they won't download styles, scripts, and other external resources. There's no method that I know of to set a mailto link using CSS. In addition, you specifically said you didn't want to set the link using JavaScript.

If you think about what other types of resources there are, there are also external documents (i.e. HTML documents loaded through iframes). Almost no scrapers would bother downloading the contents of iframes. Therefore, you can simply do:

index.html:

<iframe src="frame.html" style="height: 1em; width: 100%; border: 0;"></iframe>

frame.html:

My email is <a href="mailto:me@example.com" target="_top">me@example.com</a>

To human users, the iframe looks just like normal text. Iframes are inline and transparent by default, so we just need to set its border and dimensions. You can't make the size of the iframe match the size of its content without using JavaScript, so the best we can do is give it predefined dimensions.

Defeating email bots is a tough one. You may want to check out the Email Address Harvesting countermeasures section on Wikipedia.

My back-story is that I've written a search bot. It crawled 105,000+ URLs during its initial run many years ago. What I've learned from doing that is that web crawling bots literally see EVERYTHING that is text on a web page. Bots read everything except images.

Spam can't be easily stopped via code for these reasons:

  1. CSS & JS are irrelevant when using a mailto: link. Bots specifically look at HTML pages for that "mailto:" keyword. Everything from that colon to the next single quote or double quote (whichever comes first) is seen as an email address. HTML entity email addresses, like the example above, can be quickly translated using a reverse ASCII method/function. A few lines of JavaScript quickly turn the string which starts with &#121;&#111;&#117;&#114;... into yourname@domain.com. (My search bot threw away hrefs with mailto: email addresses, as I wanted URLs for web pages and not email addresses.)

  2. If a page crashes a bot, the bot author will tune the bot to fix the crash with that page in mind, so that the bot won't crash at that page again in the future. Thus making their bot smarter.

  3. Bot authors can write bots, which generate all known variations of email addresses... without crawling pages & never using any starter email addresses. While it may not be feasible to do that, it's not inconceivable with today's high-core count CPUs (which are hyper-threaded & run at 4+ GHz), plus the availability of using distributed cloud-based computing & even super computers. It's conceivable that someone can now create a bot-farm to spam everyone, without knowing anyone's email address. 20 years ago, that would have been incomprehensible.

  4. Free email providers have had a history of selling their free user accounts to their advertisers. In the past, simply signing up for a free email account automatically guaranteed them a green light to start delivering spam to that email address... without ever using that email address online. I've seen that happen multiple times, with famous company names. (I won't mention any names.)

  5. The mailto: keyword is part of this IETF RFC, where browsers are built to automatically launch the default email clients, from links with that keyword in them. JavaScript has to be used to interrupt that application launching process, when it happens.
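As a rough illustration of point 1 (this is not the code of my search bot or of any real harvester, and the function names are made up for the example), a bot only needs something along these lines to undo the HTML entity approach:

```javascript
// Decode decimal HTML entities such as &#121; back into characters.
function decodeEntities(text) {
  return text.replace(/&#(\d+);/g, (_, code) => String.fromCodePoint(Number(code)));
}

// Collect everything between "mailto:" and the next single or double
// quote, then decode any entities found in that span.
function harvestMailto(html) {
  const found = [];
  const pattern = /mailto:([^"']*)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    found.push(decodeEntities(match[1]).trim());
  }
  return found;
}

const page =
  '<a href="mailto:&#121;&#111;&#117;&#114;&#110;&#097;&#109;&#101;&#064;' +
  '&#100;&#111;&#109;&#097;&#105;&#110;&#046;&#099;&#111;&#109;">mail</a>';
console.log(harvestMailto(page)); // [ 'yourname@domain.com' ]
```

That is why entity encoding only filters out the laziest harvesters.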

I don't think it's possible to stop 100% of spam while using traditional email servers, without using filters on the email server and possibly using images.

There is one alternative... You can also build a chat-like email client, which runs internally on a website. It would be like Facebook's chat client. It's "kind of like email", but not really email. It's simply 1-to-1 instant messaging with an archiving feature... that auto-loads upon login. Since it has document attachment + link features, it works kind of like email... but without the spam. As long as you don't build an externally accessible API, then it's a closed system where people can't send spam into it.

If you're planning to stick with strictly traditional email, then your best bet may be to run something like Apache's SpamAssassin on a company's email server.

You can also try combining multiple strategies as you've listed above, to make it harder for email harvesters to glean email addresses from your web pages. They won't stop 100% of the spam, 100% of the time... while also allowing 100% of the screen readers to work for blind visitors.

You've created a really good starting look at what's wrong with traditional email! Kudos to you for that!

A good screen reader is JAWS from Freedom Scientific. I've used that before to listen to how my webpages are read by blind users. (If you hear a male voice reading both actions [like clicking on a link] & text, try changing 1 voice to female so that 1 voice reads actions & another reads text. That makes it easier to hear how the web page is read for the visually impaired.)

Good luck with your Email Address Harvesting countermeasure endeavours!

First, I don't think doing anything with CSS will work. All bots (except Google's crawler) simply ignore all styling on websites. Any solution has to work with JS or server-side.

A server-side solution could be making an <a> that links to a new tab, which simply redirects to the desired mailto:
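A minimal sketch of that redirect idea, assuming a Node.js server. The `/contact` route, the `CONTACT_ADDRESS` constant and the `handleRequest` name are all made up for illustration, and how a browser reacts to a redirect that points at mailto: may vary, so treat this as something to test rather than a guarantee.

```javascript
// Hypothetical address and route; adjust both for a real site.
const CONTACT_ADDRESS = "me@example.com";

function handleRequest(req, res) {
  if (req.url === "/contact") {
    // The address only appears in the Location header of the redirect,
    // never in the HTML of the page that links to /contact.
    res.writeHead(302, { Location: "mailto:" + CONTACT_ADDRESS });
  } else {
    res.writeHead(404);
  }
  res.end();
}

// To serve it with Node's built-in http module:
// require("http").createServer(handleRequest).listen(8080);
```

The page itself would then only contain something like `<a href="/contact" target="_blank">Email me</a>`. A bot that follows redirects can still read the header.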

That's all my ideas for now. Hope it helps.

Here is an approach that does make use of JavaScript, but with a rather small footprint. It's also very "hacky", and generally I would not recommend an approach with inline JS in the HTML unless you have an extreme reluctance to use JS at all.

<a
href="#"
data-contact="bGUtZW1haWxAdGhlLWRvbWFpbi5jb20="
data-subj="QW4gQW1hemluZyBTdWJqZWN0"
onfocus="this.href = 'mailto:' + atob(this.dataset.contact) + '?subject=' + atob(this.dataset.subj || '')"
>
Send an email
</a>

data-contact is the base64 encoded email address. And, data-subj is an optional base64 encoded subject.
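To produce the two attribute values, something like the following sketch could be used. The helper names `encodeContact` and `buildMailto` are hypothetical; `btoa` and `atob` are the standard base64 functions (available in browsers and in recent Node.js versions) and only handle Latin-1 text. Unlike the inline handler, this version also URL-encodes the subject, which is an addition of mine.

```javascript
// Encode the address and optional subject for the data-* attributes.
function encodeContact(email, subject) {
  return { contact: btoa(email), subj: subject ? btoa(subject) : "" };
}

// Mirror of the inline onfocus handler: decode and assemble the href.
function buildMailto(data) {
  const subject = atob(data.subj || "");
  return "mailto:" + atob(data.contact) + "?subject=" + encodeURIComponent(subject);
}

const data = encodeContact("le-email@the-domain.com", "An Amazing Subject");
console.log(data.contact); // bGUtZW1haWxAdGhlLWRvbWFpbi5jb20=
console.log(data.subj);    // QW4gQW1hemluZyBTdWJqZWN0
console.log(buildMailto(data));
```

The two printed values are the ones used in the example markup.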

The main challenge with doing this without JS is that CSS can't alter HTML attributes. (The article you linked is a "pie-in-the-sky" musing and does not have any bearing on what is possible today or in the near future.)

The HTML entities approach you mentioned, or some variation of it, is likely the simplest option that will have some efficacy. Additionally, the iframe approach is clever and the server redirect approach is pretty awesome. But, all three are vulnerable to bots:

  • The HTML entities just need to be converted (and detecting that is simple)
  • The document referenced by the iframe might simply be followed
  • The server redirect might simply be followed, as well

With the approach outlined above, the use of a base64 encoded email address in a data-contact attribute is very "one-off" – as long as the scraper is not specifically designed for your site, it should work.

The short answer is that fulfilling all your requirements is impossible.

Some of the script-based options answered here may work for certain bots, but you wanted no-script, so, no, you can't.

Simple + Lot of @ + Editable without tools

<a href="mailto:user@domain@@com"
onmouseover="this.href=this.href.replace('@@','.')">
Send email
</a>

Have you considered using Google's reCAPTCHA Mailhide? https://www.google.com/recaptcha/admin#mailhide

The idea is that when a user clicks the checkbox (see nocaptcha below), the full e-mail address is displayed.

reCAPTCHA has traditionally been hard not only for screen readers but for humans as well. With the rollout of Google's No CAPTCHA reCAPTCHA, it appears to show promise for screen readers, as it renders as a traditional checkbox from their view.

Example #1 - Not secure but for easy illustration of the idea

Here is some code as an example without using mailhide but implementing something using recaptcha yourself: https://jsfiddle.net/43fad8pf/36/

<div class="container">
<div id="recaptcha"></div>
</div>
<div id="email">
Verify captcha to get e-mail
</div>


function createRecaptcha() {
grecaptcha.render("recaptcha", {sitekey: "6LcgSAMTAAAAACc2C7rc6HB9ZmEX4SyB0bbAJvTG", theme: "light", callback: showEmail});
}
createRecaptcha();


function showEmail() {
// ideally you would do server side verification of the captcha and then the server would return the e-mail
document.getElementById("email").innerHTML = "email@example.com";
}

Note: In my example I have the e-mail in a JavaScript function. Ideally you would have the recaptcha validated on the server end, and return the e-mail, otherwise the bot can simply get it in the code.

Example #2 - Server side validation and returning of e-mail

If we use an example more like this, we get additional security: https://designracy.com/recaptcha-using-ajax-php-and-jquery/

function showEmail() {
    /* Check if the captcha is complete */
    if ($("#g-recaptcha-response").val()) {
        $.ajax({
            type: 'POST',
            url: "verify.php", // The file we're making the request to
            dataType: 'html',
            async: true,
            data: {
                captchaResponse: $("#g-recaptcha-response").val() // The generated response from the widget sent as a POST parameter
            },
            success: function (data) {
                alert("everything looks ok. Here is where we would take 'data' which contains the e-mail and put it somewhere in the document");
            },
            error: function (XMLHttpRequest, textStatus, errorThrown) {
                alert("You're a bot");
            }
        });
    } else {
        alert("Please fill the captcha!");
    }
}

Where verify.php is:

$captcha = filter_input(INPUT_POST, 'captchaResponse'); // get the captchaResponse parameter sent from our ajax

/* Check if captcha is filled */
if (!$captcha) {
    http_response_code(401); // Return error code if there is no captcha
    exit;
}
$response = file_get_contents("https://www.google.com/recaptcha/api/siteverify?secret=YOUR-SECRET-KEY-HERE&response=" . urlencode($captcha));
$result = json_decode($response); // siteverify answers with JSON containing a boolean "success" field
if (!$result || !$result->success) {
    http_response_code(401); // It's SPAM! RETURN SOME KIND OF ERROR
    echo 'SPAM';
} else {
    // Everything is ok, should output this in json or something better, but this is an example
    echo 'email@example.com';
}

The one method I found effective is using it with CSS like below:

<a href="mailto:myemail@ignore-example.com">myemail@<span style="display:none;">ignore-</span>example.com</a>

and then write some JavaScript that removes the ignore- word from the href="mailto:..." attribute with a regex. This hides the email from bots, because they read the address with the ignore- word in front of the real domain. It still works with screen readers, and when the user clicks on the link, a custom JS function removes the ignore- word from the href attribute so that it opens the real email address.

This method has been working very effectively for me to date. You can read more on this here: http://techblog.tilllate.com/2008/07/20/ten-methods-to-obfuscate-e-mail-addresses-compared/
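The clean-up script is not shown above, so here is a minimal sketch of what it could look like. The function name `stripDecoy` and the way it is wired to the click event are my assumptions, not the exact code I run.

```javascript
// Remove the decoy "ignore-" prefix that sits right after the @ sign.
function stripDecoy(href) {
  return href.replace(/^(mailto:[^@]+@)ignore-/, "$1");
}

console.log(stripDecoy("mailto:myemail@ignore-example.com"));

// In the page, this could be attached to every mailto link, e.g.:
// document.querySelectorAll('a[href^="mailto:"]').forEach(function (link) {
//   link.addEventListener("click", function () {
//     link.href = stripDecoy(link.href);
//   });
// });
```

Because the href is rewritten inside the click handler, the browser follows the cleaned address.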

PHP solution

function printEmail($email){
$email = '<a href="mailto:'.$email.'">'.$email.'</a>';
$a = str_split($email);
return "<script>document.write('".implode("'+'",$a)."');</script>";
}

Use

echo printEmail('test@example.com');

Result

<script>document.write('<'+'a'+' '+'h'+'r'+'e'+'f'+'='+'"'+'m'+'a'+'i'+'l'+'t'+'o'+':'+'t'+'e'+'s'+'t'+'@'+'e'+'x'+'a'+'m'+'p'+'l'+'e'+'.'+'c'+'o'+'m'+'"'+'>'+'t'+'e'+'s'+'t'+'@'+'e'+'x'+'a'+'m'+'p'+'l'+'e'+'.'+'c'+'o'+'m'+'<'+'/'+'a'+'>');</script>

P.S. Requirement: user must have JavaScript enabled

Based on the code of MaanooAk, here is my version:

<a href="mailto: Mike Myers"
onclick="this.href=this.href.replace(' Mike ','MikeMy'); this.href=this.href.replace('Myers','ers@vwx.yz')">&#9993; Send Email</a>

The difference from MaanooAk's version is that on hover you don't see mailto: and a broken email address, but mailto: and the name of the contact. When you click on it, the name is replaced by the email address.

In the code, the email address is split into two parts. Nowhere in the code is the complete email address visible.

Here is my new solution for this. I first build the email address string by concatenating small pieces and then also use this string as the title:

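<a class='Email' onclick='mail()'>&#9993; Send Email</a>
<script>
// Build the address from small pieces so it never appears whole in the source.
var adress = 'mailt' + 'o:MikeM' + 'yers@v' + 'wx.yz';
// Runs after the link above has been parsed, so the element exists.
document.getElementsByClassName('Email')[0].title = adress;
function mail(){window.location.href = adress;}
</script>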

I use this in the footer of a website, where many pages all share the same footer.