Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains

Joe Quinn Wed, 14 Oct 2015 09:13:55 -0700

On 10/14/2015 12:00 PM, Bill Cole wrote:

Describe, in detail, the new SA technology which fights abuse of new TLDs.
Prior to v3.4.1, the mechanism for detecting and parsing hostnames to identify body URIs used an embedded array of hardcoded domains in Mail/SpamAssassin/Util/RegistrarBoundaries.pm. This resulted in many URIs in the new TLDs not being detected and filtered as URIs. In v3.4.1 there is the new Mail/SpamAssassin/RegistryBoundaries.pm and the file 20_aux_tlds.cf in the canonical rules set, which now contains a comprehensive, maintained list of TLDs and other registry-managed domains.

A mention of why the list is even needed:

Most URLs are obvious and of the form "http://sub.domain.tld/blahblahblah" and easy to detect. However, mail clients will also accept things like "sub.domain.tld/blahblahblah" without the protocol. We want to detect as many URLs as possible and ideally zero non-URLs, because each can turn into multiple DNS lookups. The list of TLDs gives us a way to eliminate obvious non-URLs, but it was designed when the worst we had to deal with was 100-ish ccTLDs that rarely changed. Nowadays it's easy for spammers to buy up garbage domains like example.bacon / example.click / example.industries, making an up-to-date list of TLDs much more important.
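
To make the idea above concrete, here is a rough Python sketch of TLD-gated URL detection. It is not SpamAssassin's code (SA is Perl and its parser is considerably more involved). The names KNOWN_TLDS, CANDIDATE and find_uris are made up for illustration, and the TLD set is a tiny sample, not the real list.

```python
import re

# Hypothetical, deliberately tiny sample. A real list is far longer
# and has to be kept current as new TLDs are delegated.
KNOWN_TLDS = {"com", "org", "net", "uk", "click", "industries"}

# Candidate: optional scheme, dot-separated hostname labels, optional path.
CANDIDATE = re.compile(
    r"(?:https?://)?"
    r"((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,})"
    r"(/\S*)?",
    re.IGNORECASE,
)


def find_uris(text, tlds=KNOWN_TLDS):
    """Return URL-like strings whose last hostname label is a known TLD."""
    hits = []
    for match in CANDIDATE.finditer(text):
        host = match.group(1).lower()
        # Anything ending in an unknown label is treated as a non-URL,
        # so no DNS lookups would be spent on it.
        if host.rsplit(".", 1)[-1] in tlds:
            hits.append(match.group(0))
    return hits
```

With the sample set, "sub.domain.click/offer" is picked up even without a protocol, while "report.pdf" or a missing space after a full stop ("lookups.The") is dropped. The flip side is the problem described above: "example.bacon" is silently missed until "bacon" is added to the list.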
