On 11/17/2011 10:27 AM, Marc Perkel wrote:
> I'm exploring a variety of ideas to determine the difference between
> "serious" domains down to throw away domains used by spammers. The ideas
> I'm presenting here are not complete but are just a conversation starter.

I'd think domain age would be the biggest indicator of *not* being a
throw-away domain. IIRC you already track that in your own DNS lists.

> for example, if the sending domain has no MX records of its own it is
> more likely spam than if there are 3 or more MX records that resolve to
> multiple IPs over more than one network. Generally spam only domains are
> minimally configured, and highly configured domains are not spam only. I
> also think that NS records might indicate that a domain is serious or not.

Maybe. Keep in mind that a throw-away domain might be purchased through
a registrar that provides email hosting, and might appear to be nicely
configured. Perhaps in combination with age, this could be a semi-useful
indicator, i.e.: a new domain with only one (implicit) MX has a decent
chance of being a throwaway; an old domain is probably not a throwaway;
a new domain that is nicely configured is hard to tell, though...
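
To make that concrete, here is a rough sketch of the age-plus-MX idea in
Python. The names (DomainFacts, throwaway_guess) and the 30-day cutoff
for "new" are my own placeholders, not anything that exists in SA or in
anyone's lists, and the three outcomes are just the three cases above.

```python
from dataclasses import dataclass


@dataclass
class DomainFacts:
    age_days: int  # days since registration (hypothetical input, e.g. from whois)
    mx_count: int  # explicit MX records; 0 means only the implicit MX


def throwaway_guess(d: DomainFacts, new_days: int = 30) -> str:
    """Combine domain age and MX count into a rough guess.

    new_days is an assumed cutoff for "new"; tune it against real data.
    """
    if d.age_days >= new_days:
        # Old domain: probably not a throwaway, whatever the MX setup.
        return "probably not throwaway"
    if d.mx_count <= 1:
        # New domain with a minimal mail setup.
        return "decent chance of throwaway"
    # New domain, nicely configured: age and MX alone can't settle it.
    return "hard to tell"
```

The point of returning "hard to tell" rather than a score is that the
new-and-nicely-configured case needs other signals to resolve.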

There also are free email hosting options that involve multiple MXs..
Put yourself in spammy's shoes for a moment... if you had a throwaway
domain would you host the incoming email on your own servers, or on
gmail apps? I'd choose gmail in a heartbeat. Then again, plenty of legit
domains use gmail too, so in and of itself it's not a sign of seriousness
nor of lack thereof.

Regarding NS records, yeah there are some nameservers that seem to only
host bad domains, others that seem to only (or mostly) host good
domains, and then a whole bunch that host all kinds of domains.
Nameserver reputation might be a good track to follow IMHO.

If you're trying to predict how serious a newly registered domain is,
I'd look at the nameservers, and the whois.

Private registration, cheap registrar, cheap DNS, brand new domain:
probably bad news until more info is available.

Proper whois info, expensive registrar, expensive DNS and/or DNS with a
good reputation, brand new domain: far less suspicious.

Proper whois info, cheap registrar, expensive DNS and/or DNS with a
good reputation, brand new domain: somewhere in the middle.
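
Written out as code, the three cases above look something like the
sketch below. The function name and the boolean inputs are hypothetical;
how you decide that a registrar or DNS host is "cheap" is left open, and
any combination I didn't list comes back as "unknown" rather than a
guess.

```python
def new_domain_suspicion(private_whois: bool,
                         cheap_registrar: bool,
                         cheap_dns: bool) -> str:
    """Rough suspicion level for a brand new domain only.

    cheap_dns=False stands for expensive DNS and/or DNS with a good
    reputation. All three inputs are assumed to be decided elsewhere.
    """
    if private_whois and cheap_registrar and cheap_dns:
        return "high"    # probably bad news until more info is available
    if not private_whois and not cheap_registrar and not cheap_dns:
        return "low"     # far less suspicious
    if not private_whois and cheap_registrar and not cheap_dns:
        return "middle"  # somewhere in the middle
    return "unknown"     # combination not covered by the three cases
```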

I'm just thinking out loud here. Does this sound sensible to you?

> I think the serious scale could be a useful factor in SA. It doesn't
> determine if it's spam or ham in itself. Yahoo is a serious domain and
> there's lots of spam. Serious domains should not be blacklisted for
> example. We could also look for consistency. Bad RDNS from a serious
> domain might be a spam indicator.
> 
> There might be other methods of detecting serious domains. If they are
> using expensive services. Spammers would not have their dns hosted with
> Ultra DNS, or use the expensive registrars, or other services that are
> expensive.

Well, spammers might well use UltraDNS and/or expensive registrars, but
probably would only do so for their domains which they intend on keeping
long term, rather than ones they intend to throw away.

> Also - thinking we should slowly mine the whois database and provide
> some sort of DNS based lookup of whois information to be able to
> determine the registrar of a domain, the domain age, or other info that
> would be useful in determining that the domain is serious or not.

That is nothing new. Spam Eating Monkey AFAIK has some zones which tell
about privatized whois, age of registration, etc. You also have a zone,
if I'm not mistaken, that approximates how long ago a domain was first
seen (by your users) in email. Some of the SORBS zones can also be used
in tandem with each other to extrapolate similar info. SenderBase also
keeps tabs on the history of domains as well as IPs.

The question becomes how to use the data in a way to approximate the
"seriousness" of the domain, and then how to use the approximated domain
"seriousness" to improve filtering. Two separate questions if you ask me.

> Who thinks I'm onto something?

I think you *might* be on to something, or might not.

Here's another thought off the top of my head: looking at nameservers
and calculating average age of domains hosted on the nameservers, as
well as the average length of time that the domains hosted on the
nameservers have been on those specific nameservers...
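
Something along these lines, as a quick Python sketch. It assumes you
already have, from somewhere, one record per domain giving its
nameserver, its age in days, and how many days it has been on that
nameserver; the function name and the record layout are made up for
illustration.

```python
from collections import defaultdict
from statistics import mean


def nameserver_averages(records):
    """Average domain age and average tenure per nameserver.

    records: iterable of (nameserver, domain_age_days, days_on_nameserver)
    tuples (a hypothetical layout). Returns a dict mapping each
    nameserver to (average_age_days, average_tenure_days).
    """
    by_ns = defaultdict(list)
    for ns, age_days, tenure_days in records:
        by_ns[ns].append((age_days, tenure_days))
    return {
        ns: (mean([a for a, _ in rows]), mean([t for _, t in rows]))
        for ns, rows in by_ns.items()
    }
```

A nameserver whose domains are mostly a few days old and only just
arrived would stand out against one hosting long-lived domains, though
where to draw the line is something the data would have to show.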

-- 
Joe Sniderman <joseph.snider...@thoroquel.org>
