On Fri, 20 Jan 2017, Bill Keenan wrote:
I am interested/willing to be part of masscheck. However, I use SpamAssassin via amavisd-new.
On Fri, 20 Jan 2017, David Jones wrote:
I would like to help with the nightly masscheck but I don't have the resources to manually check ham and spam. This also gets into the grey area of how people define spam. I also have a very good MTA setup with RBLs and DNS checks that block most of the spam before it reaches SA in MailScanner. My SA only has to block a very small percentage of my definition of spam so I am not sure how helpful my mail filtering platform can be even though it's very accurate.
Participating in masscheck is different from merely *using* SpamAssassin. The masscheck environment and your production mail filtering environment will likely be separate from each other.
You will need to have a complete SA development environment kept up-to-date from SVN and compiled regularly so that you're testing the correct rules. Alternatively, if your corpora are small and you don't have concerns about possible leakage, your corpora *can* be uploaded to the SA masscheck server for central scanning. However, distributing the load is strongly desired, so this shouldn't be the default method of participation.
You will need to have manually-vetted ham and (ideally) spam, though if you have a honeypot set up (either mailbox(es) or domain(s)) that you ***know*** will not receive ham then that can be directly fed into your masscheck spam corpus.
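As a rough illustration of the "mailbox(es) or domain(s)" distinction, here is a minimal sketch of how a delivery script might decide that a recipient is a honeypot address. The function name and the trap lists are hypothetical and not part of SpamAssassin; how you obtain the envelope recipient depends on your MTA.

```python
from email.utils import parseaddr


def is_honeypot_recipient(recipient, trap_mailboxes=frozenset(), trap_domains=frozenset()):
    """Return True if the recipient is a known trap mailbox or is in a trap domain.

    trap_mailboxes and trap_domains are hypothetical, locally maintained sets
    of lower-case addresses and domains that are known never to receive ham.
    """
    _, addr = parseaddr(recipient)  # strip any display name and angle brackets
    addr = addr.lower()
    if "@" not in addr:
        return False
    domain = addr.rsplit("@", 1)[1]
    return addr in trap_mailboxes or domain in trap_domains


if __name__ == "__main__":
    traps = {"old-user@example.org"}
    domains = {"trap.example"}
    print(is_honeypot_recipient("Old-User@Example.ORG", traps, domains))
    print(is_honeypot_recipient("real-person@example.org", traps, domains))
```

Only mail that matches would go straight into the spam corpus; everything else still needs manual vetting.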
How many resources (time, etc.) you can devote to corpora maintenance is a major factor in the quality of your contribution. Your masscheck corpora *must* be clean or the rule scoring will be done poorly, perhaps even poisonously.
The masscheck corpora also need to be kept fairly fresh, so it's an ongoing process.
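One way to keep an eye on freshness is a small script that lists messages older than some cutoff so they can be rotated out. The sketch below assumes one message per file in a corpus directory and trusts the Date header, which spam often forges, so treat the result as a list to review rather than to delete blindly. The function names and the cutoff are illustrative, not masscheck's own rules.

```python
import email
import email.utils
from datetime import datetime, timedelta, timezone
from pathlib import Path


def message_date(raw):
    """Return the Date header as an aware datetime, or None if missing or unparseable."""
    header = email.message_from_bytes(raw).get("Date")
    if not header:
        return None
    try:
        parsed = email.utils.parsedate_to_datetime(str(header))
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        # Headers with a "-0000" zone parse as naive datetimes; treat them as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def stale_messages(corpus_dir, max_age_days, now=None):
    """List files whose Date header is older than max_age_days, or cannot be read."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for path in sorted(Path(corpus_dir).iterdir()):
        if not path.is_file():
            continue
        sent = message_date(path.read_bytes())
        # Undated messages are flagged too, so a human can look at them.
        if sent is None or sent < cutoff:
            stale.append(path)
    return stale
```

Calling `stale_messages("corpus/spam", 60)` would return the files in that (hypothetical) directory dated more than 60 days ago; pick a cutoff that matches whatever age limits the masscheck you submit to actually uses.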
Collecting spam after RBL filtering is much less helpful to masscheck. Ideally your spam corpus is from a totally unfiltered feed.
However, even if it is filtered and small, it helps, *especially* if the ham is not in English. Masscheck is perennially starved for non-English ham, and rule scoring is thus biased against non-English languages to a degree.
(However, there are some honeypots in Europe feeding masscheck, so that may actually be less of a problem than I believe it is...)
-- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79