> Herb asked:
> >What makes you uncomfortable? There really are only two issues:
> >1) Delay of legitimate email
> >2) Broken legitimate servers that won't resend
>
> Herb,
>
> Many web page event-driven e-mails may not have a re-try
> mechanism. And I think that some legit opt-in lists and
> newsletters also might be sent on a 1-pass scenario. ... in
> addition to what you've conceded.

But these web page emails are not likely to be on RBLs if they are running on legitimate email servers AND are not open relays etc., and many/most such pages actually send their email through some SMTP server precisely so they will not need to deal with retries etc.

I worried about this with my OWN customer web page email (to me only) but realized that I was going to have it use a reliable SMTP server anyway. (One that an outside sender cannot use.)

But whitelisting can take care of that...

> I don't see how a "greylisting-whitelist" could keep up with
> the multitude of scenarios, even if these are all low
> frequency percentage-wise.

There just aren't that many in our experience -- greylistd comes with a nice whitelist (of broken servers) and you can add more. Maintaining a whitelist does not simplify things, though, so it is important for you to know that we have only had to do this once, right after installing the greylist daemon.

It just doesn't happen that we lose mail from web forms. How much LEGITIMATE email do you get (or does anyone get) from "unknown" web forms?

> However, I do see how your method is an excellent way to both:
>
> (1) minimize the problems of greylisting by going after what
> is almost always spam in the 1st place
>
> AND
>
> (2) minimize the problems of FPs on RBLs since the legit mail
> will almost always make it past the greylist.
>
> Therefore, I don't knock your system... maybe I just need to
> test the waters and get a little more comfortable with it.

Testing is good. Depending on your email server this is easy -- with Exim I just used a WARN on the greylist for a couple of days, then switched it to DEFER (which is a temp reject as opposed to a DENY).

By the way, there is an "add-on SA Plugin" that does greylisting, but I don't see the point of that, as SA is NOT really a spam filter but a content classifier and only gives your MTA etc. the info it needs to decide what to do with the email.

Also, for us SpamAssassin is TOO LATE in the chain 95% of the time.
We've already greylisted most things by the time SA runs (and thus avoid the expense of SA processing if the mail is not from a reasonably functional SMTP server).

> I fact, I probably will do this eventually... and I'm most
> interested in putting it to use on only those messages which
> just barely got caught and/or which just barely didn't get
> caught by my current spam filtering. I'm kind of excited to
> see if this can squeak a few tenths of a percent better
> filtering out of my filter without generating FPs.

It will (help, that is), but the key to what I am suggesting (and doing) is that YOU get to decide.

We still use SpamAssassin for all the hard cases (and even to drive a small amount of email through the greylist). If I had to give up one it would be Greylisting, since SpamAssassin is a more comprehensive and general tool.

So, all of those "excellent" RBLs that you trust can still be used to REJECT, and the flaky ones that Greylist doesn't stop can be used to SCORE in SpamAssassin.

This latter is true for ANY test you choose. Some are reliable enough to reject (or even accept) and some are still going to get by Greylisting (about 10% of what we feed through greylisting "comes back" again -- and truthfully almost none of that is actually Ham).

But of course, in our system NONE of what is rejected by such "suspicion driven greylisting" is HAM.

> ONE MORE THOUGHT:
> In general, I think that they greylist-only people (NOT Herb,
> btw) are just lazy and are willing to make due with an
> inferior system which is brainless and easy... But this also
> the problem with brain-less Bayesian-only filters and,
> consequently, the spammers found ways to beat the Bayesian
> filters. You know, there is a similarly easy way for them to
> beat greylisting, too...

Defense in depth is the key. Trying to do that efficiently is critical for systems with high mail volumes.
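
To make the "YOU get to decide" tiering concrete, here is a minimal sketch of the SMTP-time decision described above: trusted RBLs reject outright, flaky RBLs only trigger greylisting, and everything else goes straight on to the content classifier. The list names, the `Action` enum and the `decide` function are all hypothetical and exist only for illustration; this is not the actual Exim configuration.

```python
from __future__ import annotations

from enum import Enum


class Action(Enum):
    REJECT = "reject"      # permanent refusal at SMTP time
    GREYLIST = "greylist"  # temporary deferral; a working server will retry
    CONTINUE = "continue"  # accept and hand over to the content classifier


# Hypothetical list names, used only for illustration.
TRUSTED_RBLS = frozenset({"trusted.rbl.example"})
FLAKY_RBLS = frozenset({"flaky.rbl.example"})


def decide(rbl_hits: set[str]) -> Action:
    """Pick the SMTP-time action for a connection, given the RBLs listing it."""
    if rbl_hits & TRUSTED_RBLS:
        # Reliable enough to reject on its own.
        return Action.REJECT
    if rbl_hits & FLAKY_RBLS:
        # Suspicious but not certain: make the sender prove it will retry.
        return Action.GREYLIST
    # No suspicion at SMTP time: no greylist delay for this mail.
    return Action.CONTINUE
```

Mail that comes back after a greylist deferral would then continue to the classifier like any other message, where the same flaky RBL hit can still add to its score.
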
One of the useful features of my greylist mechanism is that it REDUCES the load both in receiving mail bodies AND in processing them through SpamAssassin.

Oddly enough, my BAYES_99 and BAYES_95 in SA give 0% false positives and hit 70-80% of spam, pretty good for BAYES_99 + BAYES_95, but if this seems low to others it's important to remember we are knocking down (over?) 90% of Spam before SA even sees it.

One of the big advantages of this method is not that SA couldn't classify it, but rather that no human needs to later review the greylist defers that never return.

If there WERE an FP, the sender would (almost certainly) get a Non-Delivery Report from his OWN email server, and we NEVER send NDRs, thus avoiding adding to the collateral spam (backscatter) that comes from accepting and later trying to notify the supposed "sender".

> simply track the "451 4.7.1" responses and send these again a
> few minutes later. When this becomes common practice, those
> who rely on greylisting will find their filters failing big time.

Yes. That is absolutely correct, and perhaps I should not spend time explaining how to make safe use of greylisting (as long as only a few of us do this the spammers will just keep slamming everyone else <grin>).

But do realize that this would be a significant improvement over the current crop of spam zombies AND it will force them to work harder and expend more resources to get the same "benefit" they do now.

Slowing the zombies down is NOT a bad thing.

This week I have written a SpamAssassin plugin for CRM114, a Markovian and Hyperspace classifier (akin to Bayes classification but with a perhaps more comprehensive classifier).

It works and it will ADD to my defense in depth. This thing is actually running in my production SA, and adding/subtracting score based on its classification.

It's not suitable for everyone yet, since it is still crude (idiosyncratic to my systems) and has to call an external executable (which is unsuitable for high volume mail systems).

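
For anyone unfamiliar with what a greylist deferral actually does, here is a minimal sketch of the core bookkeeping. It assumes the common approach of keying on the (IP, sender, recipient) triplet and a five-minute minimum retry delay; the class name, the delay and the reply strings are illustrative assumptions, not how greylistd is implemented.

```python
import time

RETRY_MIN_SECONDS = 300  # assumed minimum delay before a retry is accepted


class Greylist:
    """Remember when each (ip, sender, recipient) triplet was first seen."""

    def __init__(self, min_delay=RETRY_MIN_SECONDS, clock=time.time):
        self.min_delay = min_delay
        self.clock = clock       # injectable so the logic can be tested
        self.first_seen = {}     # triplet -> timestamp of the first attempt

    def check(self, ip, sender, recipient):
        """Return an SMTP-style reply line for this delivery attempt."""
        key = (ip, sender, recipient)
        now = self.clock()
        # Record the first attempt; later attempts keep the original timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # Temporary failure: the sending server keeps the mail queued and,
        # if it never gets through, reports that to its own user.
        return "451 4.7.1 Greylisted, please try again later"
```

A sender that never retries simply never appears again, so nothing is accepted and nothing has to be reviewed or bounced from our side. The same sketch also shows the weakness raised in the quote: a sender that remembers the 451 and retries after the delay gets through.
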
-- Herb Martin