Hi,

On Fri, Jun 14, 2013 at 4:18 PM, Amir 'CG' Caspi <ceph...@3phase.com> wrote:
> At 9:43 PM -0400 06/13/2013, Alex wrote:
>>
>> I'd say if you have any that are hitting bayes20 or lower, your
>> database is not working properly and you should probably start over.
>
> Not quite sure I want to do that... I don't really have a sufficient corpus
> of mail for good training. It's working well in general, just missing these
> particular entries. As I saw from your most recent message, you're also
> getting low Bayes scores on some similar examples... so it seems like these
> things are somewhat successful in confusing the Bayes analysis, at least on
> some DBs and with some emails (different emails confuse different DBs).

Yeah, but not bayes20. That's bad for sure. You should start collecting
now, or pull a few hundred from your recent quarantine and use those,
along with people's mail folders.

>> I thought you may have manually modified the body because this looks
>> unique:
>>
>> <x-html><!x-stuff-for-pete base=
>>
>> Do your other FNs have this? If so, you could consider generating a
>> rule from it.
>
> Almost all of my HTML FNs have this. However, almost all of my legitimate
> HTML email (TNs) also have this (regardless of source, i.e. whether it comes
> from a large company opt-in ad or whether it comes from a friend's direct
> email). It would appear to be some sort of XHTML email standard. Filtering
> on this would be disastrous, at least for the email I receive.

Good to know.

>> Search your installation and see if the two rules even exist on your
>> system.
>
> The rules definitely exist on my system. I wonder if there's some
> difference between running spamassassin manually on the message versus
> running spamd. The message I pasted was run through spamc/spamd. Is there
> something that I've misconfigured that might cause spamd to run differently
> and skip some tests, that spamassassin would manually pick up?

I think the only difference would be if spamd somehow didn't recognize
all the locations for your rules. Perhaps create a rule that you know
will hit with a very low score in each directory that contains rules.

Maybe there's a way to run spamd in the foreground with debugging, like
there is with amavisd.

Regards,
Alex
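
P.S. A rough sketch of the "rule that you know will hit" idea, in case it helps. It only builds the text of one uniquely named, near-zero-score rule per directory, so the names in the spamd report show which directories were actually read. The directory paths, the `CANARY_` prefix, the `zz_canary.cf` file name, and both helper functions are assumptions for illustration, not anything taken from your setup.

```python
import re


def canary_name(directory: str) -> str:
    """Turn a path into a rule name, e.g. CANARY_ETC_MAIL_SPAMASSASSIN."""
    # Rule names here are restricted to upper-case letters, digits and underscores.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", directory).strip("_").upper()
    return f"CANARY_{slug}"


def canary_rule(directory: str) -> str:
    """Return the text of a rule that should hit on any mail with a From header."""
    name = canary_name(directory)
    return (
        f"header   {name}  exists:From\n"
        f"describe {name}  Canary rule loaded from {directory}\n"
        f"score    {name}  0.001\n"
    )


if __name__ == "__main__":
    # Hypothetical rule directories; substitute the ones on your system.
    for d in ["/etc/mail/spamassassin", "/usr/share/spamassassin"]:
        print(f"# save as {d}/zz_canary.cf (hypothetical file name)")
        print(canary_rule(d))
```

After saving each snippet and restarting spamd, pass a message through spamc and compare the hits with what the spamassassin command reports. A canary that shows up in one but not the other would point at a directory spamd is not reading. Running spamd with -D (and without -d) should also keep it in the foreground with debug output.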