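How long can a TLD possibly be?

I'm writing an email validation regex in PHP and I need to know how long a TLD can be while still being valid. I did a few searches but couldn't find much information on the topic. So how long can a TLD possibly be?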


-EDIT-

According to RFC 2606, .localhost is a reserved domain name, and it is 9 characters long. That is the longest one I am aware of.

-END OF EDIT-

However, I think that you should care about the length of the whole email address and not only the TLD length. Below is a quote from this article. The maximum email address length is 254 characters:

There appears to be some confusion over the maximum valid email address size. Most people believe it to be 320 characters (64 characters for the username + 255 characters for the domain + 1 character for the @ symbol). Other sources suggest 129 (64 + 1 + 64) or 384 (128+1+255, assuming the username doubles in length in the future).

This confusion means you should heed the 'robustness principle' ("developers should carefully write software that adheres closely to extant RFCs but accept and parse input from peers that might not be consistent with those RFCs." - Wikipedia) when writing software that deals with email addresses. Furthermore, some software may be crippled by naive assumptions, e.g. thinking that 50 characters is adequate (examples). Your 200 character email address may be technically valid but that will not help you if most websites or applications reject it.

The actual maximum email length is currently 254 characters:

"The original version of RFC 3696 did indeed say 320 was the maximum length, but John Klensin (ICANN) subsequently accepted this was wrong."

"This arises from the simple arithmetic of maximum length of a domain (255 characters) + maximum length of a mailbox (64 characters) + the @ symbol = 320 characters. Wrong. This canard is actually documented in the original version of RFC3696. It was corrected in the errata. There's actually a restriction from RFC5321 on the path element of an SMTP transaction of 256 characters. But this includes angled brackets around the email address, so the maximum length of an email address is 254 characters."

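The limits quoted above can be turned into a cheap pre-check that runs before any regex. This is a minimal sketch in JavaScript, assuming the 254-character overall limit and the 64-character limit on the part before the @ that the quote mentions. The function name `hasPlausibleEmailLength` is made up for illustration, and it counts UTF-16 code units rather than octets, which is only the same thing for ASCII addresses:

```javascript
// Hypothetical helper: checks length limits only, not syntax.
function hasPlausibleEmailLength(address) {
  var at = address.lastIndexOf('@');
  if (at < 1 || at === address.length - 1) {
    return false; // need a non-empty local part and a non-empty domain
  }
  var localPart = address.slice(0, at);
  // 254 for the whole address, 64 for the part before the @ (limits quoted above)
  return address.length <= 254 && localPart.length <= 64;
}

console.log(hasPlausibleEmailLength('user@example.com'));             // true
console.log(hasPlausibleEmailLength('a'.repeat(65) + '@example.com')); // false
```

A check like this says nothing about whether the address exists; it only rejects input that is too long to be usable.
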
The longest TLD made up of plain Latin letters is .MUSEUM (source), but there are also some with special characters. The longest of those is XN--CLCHC0EA0B2G2A9GCD. Also, it will soon be possible to register your own TLD for a high price, so longer TLDs may well appear.

DNS allows for a maximum of 63 characters for an individual label.

The longest TLD currently in existence is 24 characters long, and subject to change. The maximum TLD length specified by RFC 1034 is 63 octets.
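As a rough illustration of the 63-octet limit, the JavaScript sketch below checks whether a string could be a TLD label at all, without saying whether that TLD actually exists. It assumes the usual letters-digits-hyphen form, which also covers Punycode labels such as xn--..., and it is deliberately permissive. The name `isPossibleTldLabel` is hypothetical:

```javascript
// Hypothetical helper: syntactic check only; it does not consult the IANA list.
function isPossibleTldLabel(label) {
  // 1 to 63 ASCII characters: letters, digits, hyphens; no leading or trailing hyphen
  return /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i.test(label);
}

console.log(isPossibleTldLabel('museum'));                   // true
console.log(isPossibleTldLabel('xn--vermgensberatung-pwb')); // true
console.log(isPossibleTldLabel('a'.repeat(64)));             // false
```

Using the protocol limit rather than the current longest TLD means the check does not need updating when new TLDs are delegated.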

To get the length of the longest existing TLD:

wget -qO - http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L

Here's what that command does:

  1. Get the latest list of actual existing TLDs from IANA
  2. Strip the first line, which is a long-ish comment
  3. Run wc -L to print the length of the longest remaining line

Alternative using curl thanks to Stefan:

curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L

This PHP code fetches an up-to-date list of TLDs, decoded to UTF-8 and separated by vertical bars, so that it can be used directly in a regular expression:

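<?php
// Requires the intl extension (for idn_to_utf8) and allow_url_fopen to be enabled.
function getTLDs($separator){
    // read one TLD per line, without trailing newlines or empty lines
    $tlds=file('http://data.iana.org/TLD/tlds-alpha-by-domain.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    array_shift($tlds); // remove heading comment
    // lowercase each entry and decode Punycode (xn--...) labels to UTF-8
    $tlds=array_map(function($e){ return idn_to_utf8(strtolower(trim($e))); },$tlds);
    usort($tlds,function($a,$b){ return strlen($b)-strlen($a); }); // sort from longest to shortest
    return implode($separator,$tlds);
}
echo getTLDs('|');
?>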

To match a host name you could use it like this:

$tlds=getTLDs('|');
// the trailing $ stops a match such as "example.comx"; i makes the match case-insensitive
if (preg_match("{([\da-z.-]+)\.($tlds)$}iu",$address)) {
..
}
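
The same idea can be sketched without any network access, which makes the moving parts easier to see. The snippet below is JavaScript, uses a tiny hard-coded sample instead of the full IANA list, and the name `buildHostPattern` is invented for illustration. It sorts the TLDs longest first, escapes them, and anchors the pattern at both ends so that it only accepts a bare host name:

```javascript
// Hypothetical helper: builds an anchored host-name regex from a list of TLDs.
function buildHostPattern(tlds) {
  var alternatives = tlds
    .map(function (t) { return t.toLowerCase(); })
    .sort(function (a, b) { return b.length - a.length; }) // longest first
    .map(function (t) { return t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); }); // escape regex metacharacters
  return new RegExp('^([\\da-z.-]+)\\.(' + alternatives.join('|') + ')$', 'i');
}

// Sample data only; a real list would come from the IANA file.
var hostPattern = buildHostPattern(['com', 'museum', 'co', 'travelersinsurance']);
console.log(hostPattern.test('example.museum'));  // true
console.log(hostPattern.test('example.comx'));    // false
console.log(hostPattern.test('example.invalid')); // false
```

Because the list changes over time, a pattern built this way is only as current as the list it was built from.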

I'm a .NET developer, so here is a JavaScript version that determines the longest TLD currently available. It logs the length of the longest TLD, which you can then use in your regex.

Please try the following code snippet:

function getTLD() {
    var length = 0;
    var longest;
    var request = new XMLHttpRequest();

    request.open('GET', 'http://data.iana.org/TLD/tlds-alpha-by-domain.txt', true);
    // attach the handler before sending so that no state change is missed
    request.onreadystatechange = function () {
        if (request.readyState === 4 && request.status === 200) {
            var type = request.getResponseHeader('Content-Type');
            // indexOf returns -1 (not 1) when "text" is absent
            if (type && type.indexOf("text") !== -1) {
                var tldArr = request.responseText.split('\n');
                tldArr.splice(0, 1); // drop the heading comment line

                for (var i = 0; i < tldArr.length; i++) {
                    if (tldArr[i].length > length) {
                        length = tldArr[i].length;
                        longest = tldArr[i];
                    }
                }

                // the request is asynchronous, so the result is logged here
                // rather than returned from getTLD()
                console.log("Longest >> " + longest + " >> " + length);
            }
        }
    };
    request.send(null);
}
<button onclick="getTLD()">Get TLD</button>

A TLD can be any length up to the 63-octet limit on a DNS label. New TLDs are added all the time. In the future there will be more TLDs not regulated by the entity currently regulating the majority of TLDs. We also won't use email in the future as we presently do. That said:

You don't ever need to validate an email address. If you want to slow people down and get an idea of whether they're actually human, include a CAPTCHA. If you need to confirm a working email address, send an email with a validation link they can open. If you aren't throttling submissions that trigger things like verification emails, it won't matter whether you've confirmed that the address is technically valid; the form will be abused regardless.

The longest TLD to date is .xn--vermgensberatung-pwb, at 24 characters in Punycode and 17 when decoded [vermögensberatung]. Leaving Punycode TLDs aside, the longest are .northwesternmutual and .travelersinsurance, both at 18 characters.
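
The two lengths are easy to confirm. This is a small sketch, assuming a Node.js runtime, whose built-in url module provides domainToUnicode for decoding Punycode labels:

```javascript
// Assumes Node.js; domainToUnicode is part of the built-in url module.
var nodeUrl = require('url');

var encoded = 'xn--vermgensberatung-pwb';
var decoded = nodeUrl.domainToUnicode(encoded);

console.log(encoded.length); // length of the Punycode (ASCII) form
console.log(decoded);        // the decoded label
console.log(decoded.length); // length of the decoded form in UTF-16 code units
```

Which of the two lengths matters for a regex depends on whether the address is checked before or after conversion to Punycode.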

However, a domain name label, the part that goes directly before a TLD, can be up to 63 characters long, as seen here: http://www.thelongestdomainnameintheworldandthensomeandthensomemoreandmore.com