Some discussion of the underlying issue.

On Tue, 2014-09-09 at 02:59 +0200, Karsten Bräckelmann wrote:
> At the time of the 3.3.2 release, the .club TLD simply didn't exist. It
> has been accepted by IANA just recently. Of course I was conveniently
> using a trunk checkout for testing and kind of shrugged off that TLD in
> question.
>
> FWIW, this is not actually a 3.3.x issue. It's the same with 3.4.0. Yes,
> that is a *recent* TLD addition... *sigh*

Unlike the util_rb_[23]tld options, the set of valid TLDs is actually
hard-coded. It would not be a problem to make that an option, too. Which,
on the plus side, would make it possible to propagate new TLDs via
sa-update. Not only 3.3.x would benefit from that, but also 3.4.0
instances. Plus, it would be generally faster anyway.

There is one downside: A new dependency on Regexp::List [1]. The one-time
startup penalty for pre-compiling the RE should be negligible.

The question is: Is it worth it? WILL it be worth it?

This incident is part of the initial round of IANA accepting generic TLDs.
There are hundreds in this wave, and some are abused early. This is
moonshine registration, nothing like new TLDs being accepted in the coming
years.

Or is it? Will new generic TLDs in the future be abused like that, too? How
frequently will that happen? Is it worth being able to react to it quickly?
How long will URIBLs take to list them? How long will it take for the
average MUA to even linkify them?

Opinions? Discussion in here, or should I move this to dev?

I guess I'd be happy to introduce to you... util_rb_tld.


[1] Well, or a really, really f*cking ugly option that takes a
    pre-optimized qr// blob containing the VALID_TLDS_RE.

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
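
For illustration only, here is a rough sketch of the idea discussed above: take the TLD list from configuration instead of hard-coding it, and compile it once at startup into a single regex. It is written in Python rather than Perl, and it is not SpamAssassin code. The option name util_rb_tld comes from the proposal; the line format (space-separated TLDs after the option name), the helper names, and the sample TLDs are all assumptions. It also builds a plain alternation and does not do the trie-style optimization that Regexp::List would.

```python
import re

# Hypothetical config text. Only the option name "util_rb_tld" comes from
# the proposal; the space-separated line format is an assumption.
CONFIG = """
util_rb_tld com net org
util_rb_tld club guru
"""


def parse_tlds(config_text):
    """Collect TLDs from every 'util_rb_tld' line (assumed format)."""
    tlds = set()
    for line in config_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "util_rb_tld":
            tlds.update(p.lower() for p in parts[1:])
    return tlds


def compile_tld_re(tlds):
    """Build one alternation anchored at the end of the string.

    This is the one-time startup cost: the pattern is compiled once and
    reused for every hostname checked afterwards.
    """
    alternation = "|".join(re.escape(t) for t in sorted(tlds))
    return re.compile(r"\.(?:" + alternation + r")\Z", re.IGNORECASE)


def has_valid_tld(hostname, tld_re):
    """Return True if the hostname ends in one of the configured TLDs."""
    return tld_re.search(hostname) is not None


# Stand-in for the hard-coded VALID_TLDS_RE, now driven by configuration.
VALID_TLDS_RE = compile_tld_re(parse_tlds(CONFIG))
```

With a setup like this, shipping a new TLD is a matter of distributing one more config line, which is the property that would let sa-update carry it to existing installs.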