Hi,

On Sun, Sep 25, 2016 at 6:18 PM, John Hardin <jhar...@impsec.org> wrote:
> On Sun, 25 Sep 2016, Alex wrote:
>
>> On Sun, Sep 25, 2016 at 4:54 PM, Sean Greenslade
>> <s...@seangreenslade.com> wrote:
>>>
>>> On Sun, Sep 25, 2016 at 04:46:28PM -0400, Alex wrote:
>>>>
>>>> I have another rule with a questionable score that's hitting too much
>>>> ham.
>>>>
>>>> From: "Customer Support" <customer.supp...@e.heritageparts.com>
>>>> dbg: rules: ran header rule __FROM_WORDY ======> got hit:
>>>> "Customer.Support@"
>
> It is causing those hams to be incorrectly classified as spam?

Yes.

X-Spam-Status: Yes, score=6.008 tag=-200 tag2=5 kill=5
    tests=[BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
    DKIM_VALID_AU=-0.1, FROM_WORDY=2.699, HTML_FONT_LOW_CONTRAST=0.001,
    HTML_MESSAGE=0.001, LOTS_OF_MONEY=0.001, MIME_HTML_ONLY=0.723,
    NUMERIC_HTTP_ADDR=1.242, RCVD_IN_DNSWL_NONE=-0.0001,
    RELAYCOUNTRY_US=0.01, RP_MATCHES_RCVD=-0.5, SPF_PASS=-0.001,
    T_DMARC_TESTS_PASS=0.01, URI_HEX=1.122] autolearn=disabled

> BAYES_50? Are you training ham? :)

Yes :-) Does this hit bayes00 for you?

# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      35152          0  non-token data: nspam
0.000          0      21542          0  non-token data: nham
0.000          0    4600265          0  non-token data: ntokens
0.000          0 1324316802          0  non-token data: oldest atime
0.000          0 1474845999          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1474783813          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

I recently deleted the database of 11M tokens, disabled autolearn, and
have been retraining it for quite a while now.
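
For what it's worth, here is a quick sanity check on the X-Spam-Status header quoted above. It is only a rough Python sketch and not anything SpamAssassin ships; the helper name parse_tests and the variable names are made up for illustration. It sums the per-rule scores from the header and shows where the message would land relative to the tag2/kill level of 5 if FROM_WORDY were not contributing.

```python
# Test list copied verbatim from the X-Spam-Status header in this message.
TESTS = (
    "BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, "
    "FROM_WORDY=2.699, HTML_FONT_LOW_CONTRAST=0.001, HTML_MESSAGE=0.001, "
    "LOTS_OF_MONEY=0.001, MIME_HTML_ONLY=0.723, NUMERIC_HTTP_ADDR=1.242, "
    "RCVD_IN_DNSWL_NONE=-0.0001, RELAYCOUNTRY_US=0.01, RP_MATCHES_RCVD=-0.5, "
    "SPF_PASS=-0.001, T_DMARC_TESTS_PASS=0.01, URI_HEX=1.122"
)

THRESHOLD = 5.0  # tag2=5 / kill=5 from the same header


def parse_tests(tests):
    """Hypothetical helper: turn 'NAME=score, NAME=score' into a dict."""
    scores = {}
    for item in tests.split(","):
        name, _, value = item.strip().partition("=")
        scores[name] = float(value)
    return scores


scores = parse_tests(TESTS)
total = sum(scores.values())
without = total - scores["FROM_WORDY"]

print(f"total score:        {total:.3f}")
print(f"without FROM_WORDY: {without:.3f}")
print("still over threshold" if without >= THRESHOLD else "under threshold")
```

The total should come out to about 6.008, matching the header, and to about 3.309 with FROM_WORDY removed, so on this particular message that one rule is what pushes it over 5.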