Understanding the hostKarma Lists

Marc Perkel Tue, 29 Sep 2009 07:23:30 -0700

Responding to a lot of questions here. The lists contain both host names and IP addresses. IP addresses everyone understands. So I'll talk about host names. Wells Fargo Bank (wellsfargo.com), for example, is in the white list, as are all of Wells Fargo's hosts. This bank sends nothing but 100% good email. But to avoid spoofing of pointer records you have to use Forward Confirmed RDNS (FcRDNS).


1.2.3.4 PTR --> mail.example.com
mail.example.com A --> 1.2.3.4


This is nearly impossible to spoof.
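
The two-step check above can be sketched in Python. This is a minimal illustration, not the code used on any production system: the function name `fcrdns_hostname` and the injectable `ptr_lookup` / `a_lookup` parameters are assumptions made so the logic can be tried without network access, and the default resolvers handle IPv4 only.

```python
import socket


def fcrdns_hostname(ip, ptr_lookup=None, a_lookup=None):
    """Return the forward-confirmed host name for ip, or None.

    ptr_lookup(ip) returns a host name; a_lookup(name) returns a list
    of IPv4 address strings. Both default to the system resolver.
    """
    if ptr_lookup is None:
        ptr_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if a_lookup is None:
        a_lookup = lambda name: socket.gethostbyname_ex(name)[2]
    try:
        name = ptr_lookup(ip)        # call 1: PTR, IP -> host name
        addresses = a_lookup(name)   # call 2: A, host name -> IPs
    except OSError:
        return None                  # missing PTR or missing A record
    # The name is trusted only if it resolves back to the same IP.
    if ip in addresses:
        return name.rstrip(".").lower()
    return None
```

A PTR record that names mail.example.com is rejected unless mail.example.com itself lists the connecting IP, which is why a spammer who controls only the reverse zone cannot borrow someone else's host name.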

Same is true for yellow lists. If the FcRDNS resolves to hotmail.com, yahoo.com, or gmail.com, then you can skip all other IP testing because the IP address tells you nothing about whether it is or isn't spam.


Warren Togami wrote:

On 09/28/2009 10:07 PM, Marc Perkel wrote:

I'd like to keep the name HOSTKARMA as standard.


If that's so, then we probably want that in the spamassassin rule
name. Your wiki page suggests JMF is the name. A number of people
probably already configured their spamassassin using your suggested
JMF rule names and they would need to be educated to remove it.

How about these for rule names, so the rule names are not too long?

RCVD_HOSTKARMA_BL Black
RCVD_HOSTKARMA_WL White
RCVD_HOSTKARMA_YL Yellow
RCVD_HOSTKARMA_BR Brown

I'm willing to go with whatever name works better for the community. I will change my wiki to be consistent.

Hi Marc,
I appreciate your desire for everyone to wholly benefit from your work, but please let us implement this for spamassassin in stages starting from the lowest hanging fruit.
First please confirm that you approve of the above new rule names, if you don't want it to be known as JMF.

Yes - or whatever works best. I can change my wiki to reflect consensus.

Hi Warren,

No one has actually implemented the rules for my blacklists correctly.
My lists support both IP and hostname lookups. The hostname assumes that
you have forward confirmed the RDNS so that you eliminate those who
might spoof.
Please explain in greater detail? Can this be determined wholly from the headers and message body after the MTA has passed the mail to the MDA?

Yes - it does require 2 DNS calls to do this for FcRDNS. You need a PTR call to get the RDNS and an A record call to confirm it.


Yellow means that the IP or hostname contains no useful information as
to spam or no spam. On my system once I determine a host is yellow I
skip all blacklists and whitelists tests. Yellow is for Yahoo, Hotmail,
Gmail, etc where the IP has no information and all host tests are
meaningless.

My NoBL list is similar to yellow, except that you can skip black list
lookups but the host might still be whitelisted somewhere.

Please help me better understand, what are examples of a sequence of events that would land an IP address on the NoBL?

NoBL is determined a number of ways. NoBL is what most RBLs call whitelisting, in that it means don't include it in any black list. To me white list means a spam-free source. People who remove their IP manually using my form will be on the NoBL list. Or it might be that I have determined that there is some good email coming from the IP and they may be a candidate for white listing, but I have yet to determine that. Yellow listing is where I know they should not be black listed but I also know they should not be white listed (yahoo, gmail, hotmail). NoBL is where I know they should not be black listed but might be white listed.

An important point to understand here is that I don't use my own lists in Spam Assassin. I do most of my filtering with Exim rules. I use my lists to avoid running SA where I can, which reduces system load. SA sees mostly yellow listed hosts.

If you just want to score points then Black, White, and Brown can be
assigned points. Yellow should be zero points regardless of how it tests.
I am aware that Yellow isn't useful for scores. It is however useful for statistical analysis in masschecks, and it doesn't cost spamassassin any more to print if it hits. In particular I'm looking to see if there are any reliable trends of overlap between Yellow and other spamassassin rules.

Fair enough. I just didn't want you assigning points to a yellow listing because the results would be false.
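
Both positions fit in one scoring table: Yellow is recorded as a hit so it shows up in reports and statistics, but it contributes zero points. The sketch below uses the rule names proposed earlier in the thread; the point values for Black, White and Brown are hypothetical placeholders, and the only value taken from the discussion is that Yellow scores zero.

```python
# Hypothetical point values. Only "Yellow is zero" comes from the thread.
SCORES = {
    "RCVD_HOSTKARMA_BL": 3.0,    # assumed value
    "RCVD_HOSTKARMA_WL": -3.0,   # assumed value
    "RCVD_HOSTKARMA_BR": 1.0,    # assumed value
    "RCVD_HOSTKARMA_YL": 0.0,    # informational only
}


def score_message(hits):
    """Return (total points, sorted rule names that hit)."""
    total = sum(SCORES.get(rule, 0.0) for rule in hits)
    return total, sorted(hits)
```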


I think the real power of my lists is in the host name lookups. It would
be worthwhile to implement that.


Please describe how this is more effective than IP lookups?

I don't have a list of IP addresses that Yahoo uses. However, if the FcRDNS resolves to yahoo then I can skip all other RBL testing because I know it's a yahoo source. Same is true of white and black listed host names. On my system if a host name lookup returns yellow, then I add the sending IP to my yellow lists for those using IP lookups. Same with the other colors.
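
For readers who have only done IP lookups, here is a sketch of how the DNS query name could be built for either kind of lookup. Reversing the octets of an IPv4 address and appending the list's zone is the usual DNSBL convention; prepending a forward-confirmed host name to the same zone is an assumption about how name lookups are addressed, and `dnsbl.example` is a placeholder zone, not the real one.

```python
import ipaddress


def dnsbl_query_name(subject, zone="dnsbl.example"):
    """Build the DNS name to query for an IPv4 address or a host name."""
    try:
        ip = ipaddress.IPv4Address(subject)
    except ipaddress.AddressValueError:
        # Not an IPv4 literal: treat it as a forward-confirmed host name.
        return f"{subject.rstrip('.').lower()}.{zone}"
    # IPv4 lookups use the octets in reverse order, as in in-addr.arpa.
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return f"{reversed_octets}.{zone}"
```

The host name passed in should be the one that survived the FcRDNS check, never the raw PTR result or the HELO string.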

I think my white listing is very accurate at this point. The thing about
white servers is that they aren't evasive like spammers. There should be
some short circuiting options to reduce system load on SA for white
lookups.
Generally spamassassin does not short-circuit by default for any reason. There is an option to do so, but I think it is only to stop testing rules if the score goes beyond a certain point. Please file a separate bug for this if it is important to you.

I'm just making a suggestion. SA is a high load program. If you are processing a lot of email then you will need a lot of servers if you use SA on everything. However if you can prescreen the email blocking what you are sure is spam and passing what you are sure is good then you can process a lot more email with far fewer servers.
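
The prescreening idea can be sketched as a front-end decision that runs before any content filter. Mapping "sure it is spam" to a black listing and "sure it is good" to a white listing is an assumption made for illustration, as are the function names and the string labels; a real setup would express this in MTA rules rather than Python.

```python
def prescreen(listing):
    """Return 'reject', 'accept', or 'scan' for a host listing."""
    if listing == "black":
        return "reject"   # blocked before the content filter runs
    if listing == "white":
        return "accept"   # delivered without content scanning
    return "scan"         # yellow, brown, unlisted: needs the filter


def messages_to_scan(listings):
    """Count how many messages still reach the content filter."""
    return sum(1 for listing in listings if prescreen(listing) == "scan")
```

With a batch of listings such as `["black", "white", "yellow", "unlisted"]`, only the last two would be handed to the content filter, which is where the load saving comes from.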
