As you all know, I'm in the spam blocking business, and I'm looking to share my information with others to help them block spam for everyone. I'm currently feeding my spam to several people.

So - I'm looking to expand this now that I feel like I'm not losing the spam battle anymore (thanks to FuzzyOCR and other new tricks).

So - let me describe my setup. I actually do most of my spam filtering with Exim rules. Using Exim I can identify a huge amount of both spam and ham without having to use SA (SpamAssassin), which is expensive resource-wise. However, SA is still very important to my setup, as it catches whatever I can't get using Exim rules.

I do front-end filtering for about 3000 domains. Mail comes in, I clean it, and forward it on to the destination server. In the process I reject millions of spams a day. What I'm also doing is capturing some of the spam and feeding it to others who provide blacklist services to everyone else. This seems to be working well and I want to expand it.

What I have is several feeds, depending on what kind of spam you are looking for. One feed is mostly from virus-infected zombies and is suitable for blacklisting the server. Another feed is spam, identified using SA, that often comes from servers like Gmail, Yahoo, Comcast and Hotmail. This feed isn't suitable for IP-based blacklists but is good for mining URI blacklists and message fingerprinting.

One thing I'm doing is just bouncing the easy stuff. If the server is already listed at Spamhaus I don't see any reason to forward it. Much of this spam is from servers not already listed on the other high quality lists. So this is "new" spam, perhaps from recently infected or exploited machines that are not easily trapped. The volume of spam is about 200,000 messages per day.
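For anyone who wants to apply the same "already listed, so skip it" check on their end, here is a minimal Python sketch of a DNSBL lookup. It is not my Exim configuration, just an illustration. The zone name `zen.spamhaus.org` and the function names are assumptions for the example, and treating any DNS answer as "listed" is a simplification; a real deployment should also look at the returned code.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone.

    DNSBLs are queried by reversing the octets of the address and
    appending the zone, e.g. 192.0.2.1 -> 1.2.0.192.<zone>.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """Return True if the zone answers with an A record for this address.

    Simplification: any answer counts as listed.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or the lookup failed): treat as not listed.
        return False


def should_forward_to_feed(ip: str) -> bool:
    """Only forward spam whose sending host is not already listed."""
    return not is_listed(ip)
```

Only `dnsbl_query_name` is pure; `is_listed` needs working DNS, and the list operator's usage terms apply to whoever runs the queries.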

I also enhance the headers, storing the sending host's IP address in a separate header for blacklist mining. There are also headers giving detailed information as to why the message was classified as spam.
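To give an idea of how a feed consumer might read those headers, here is a small Python sketch using the standard `email` module. The header names `X-Sending-Host-IP` and `X-Spam-Reason` are hypothetical placeholders, not the names the feed actually uses, and the sample message is made up.

```python
from email import message_from_string

# Hypothetical header names for illustration; the real feed's names may differ.
IP_HEADER = "X-Sending-Host-IP"
REASON_HEADER = "X-Spam-Reason"


def extract_feed_info(raw_message: str):
    """Return (sending_ip, reasons) taken from the added headers.

    sending_ip is None when the IP header is missing; reasons is a
    list with one entry per reason header (possibly empty).
    """
    msg = message_from_string(raw_message)
    ip = msg.get(IP_HEADER)
    reasons = msg.get_all(REASON_HEADER, [])
    return (ip.strip() if ip else None, [r.strip() for r in reasons])


# Made-up sample message using a documentation-only IP address.
SAMPLE = (
    "X-Sending-Host-IP: 192.0.2.44\n"
    "X-Spam-Reason: hypothetical example reason\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(extract_feed_info(SAMPLE))
```

Keeping the IP in its own header means a consumer does not have to parse Received lines to find the host that actually connected.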

So - here's the deal. If you are running a service where you provide a world-accessible blacklist to the general public, then I want to give you this feed for free. Many of you are better at processing this than I am. If you are running a commercial spam filtering service for your customers only, then I want to sell you the feed for a reasonable cost.

No feed is 100% perfect, but the IP-based zombie feed is very close. The other spam feed is also very good but will have more FPs than the first one. I don't send all my spam, just the stuff that has a very high score. You are welcome to do your own checking to verify the feed.

I am also able to extract specific parts, such as just the lists of IP addresses that should be blocked. And I'm open to suggestions about how to better provide data.
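As a rough picture of what a bare IP list could look like on the receiving end, here is a Python sketch that turns raw candidate strings into a deduplicated, sorted list. The function name and the input format (one address per string) are assumptions for the example, not a description of how I produce the lists.

```python
import ipaddress


def build_blocklist(candidates):
    """Return a sorted list of unique, valid IP address strings.

    Entries that do not parse as an IPv4 or IPv6 address are skipped.
    """
    unique = set()
    for text in candidates:
        try:
            unique.add(ipaddress.ip_address(text.strip()))
        except ValueError:
            continue  # ignore malformed entries
    # Sort by IP version first so IPv4 and IPv6 are never compared directly.
    ordered = sorted(unique, key=lambda a: (a.version, int(a)))
    return [str(a) for a in ordered]


if __name__ == "__main__":
    raw = ["192.0.2.10", "192.0.2.2", "not-an-ip", "192.0.2.10 "]
    print(build_blocklist(raw))  # ['192.0.2.2', '192.0.2.10']
```

Sorting numerically rather than as text keeps 192.0.2.2 ahead of 192.0.2.10, which makes lists easier to diff from one day to the next.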

Feedback welcome.

Reply via email to