Hi,

On Sun, Sep 25, 2016 at 6:18 PM, John Hardin <jhar...@impsec.org> wrote:
> On Sun, 25 Sep 2016, Alex wrote:
>
>> On Sun, Sep 25, 2016 at 4:54 PM, Sean Greenslade
>> <s...@seangreenslade.com> wrote:
>>>
>>> On Sun, Sep 25, 2016 at 04:46:28PM -0400, Alex wrote:
>>>>
>>>>
>>>> I have another rule with a questionable score that's hitting too much
>>>> ham.
>>>>
>>>> From: "Customer Support" <customer.supp...@e.heritageparts.com>
>>>> dbg: rules: ran header rule __FROM_WORDY ======> got hit:
>>>> "Customer.Support@"
>
> It is causing those hams to be incorrectly classified as spam?

Yes.

X-Spam-Status: Yes, score=6.008 tag=-200 tag2=5 kill=5 tests=[BAYES_50=0.8,
        DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
        FROM_WORDY=2.699, HTML_FONT_LOW_CONTRAST=0.001, HTML_MESSAGE=0.001,
        LOTS_OF_MONEY=0.001, MIME_HTML_ONLY=0.723, NUMERIC_HTTP_ADDR=1.242,
        RCVD_IN_DNSWL_NONE=-0.0001, RELAYCOUNTRY_US=0.01,
        RP_MATCHES_RCVD=-0.5, SPF_PASS=-0.001, T_DMARC_TESTS_PASS=0.01,
        URI_HEX=1.122] autolearn=disabled
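
To show how much of that total comes from FROM_WORDY alone, here is a rough Python sketch that parses the tests=[...] list from the header above and re-adds the scores. The parsing helper `parse_tests` is hypothetical, written only for this illustration, and treating 5 as the spam threshold is an assumption read off the tag2/kill values in the header.

```python
import re

# X-Spam-Status value copied from the header above.
STATUS = """Yes, score=6.008 tag=-200 tag2=5 kill=5 tests=[BAYES_50=0.8,
        DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
        FROM_WORDY=2.699, HTML_FONT_LOW_CONTRAST=0.001, HTML_MESSAGE=0.001,
        LOTS_OF_MONEY=0.001, MIME_HTML_ONLY=0.723, NUMERIC_HTTP_ADDR=1.242,
        RCVD_IN_DNSWL_NONE=-0.0001, RELAYCOUNTRY_US=0.01,
        RP_MATCHES_RCVD=-0.5, SPF_PASS=-0.001, T_DMARC_TESTS_PASS=0.01,
        URI_HEX=1.122] autolearn=disabled"""

# Assumption: the tag2/kill level of 5 is the effective spam threshold.
THRESHOLD = 5.0


def parse_tests(status):
    """Hypothetical helper: map each rule name in tests=[...] to its score."""
    body = re.search(r"tests=\[(.*?)\]", status, re.S).group(1)
    scores = {}
    for item in body.split(","):
        name, value = item.strip().split("=")
        scores[name] = float(value)
    return scores


scores = parse_tests(STATUS)
total = sum(scores.values())
without_wordy = total - scores["FROM_WORDY"]

print(f"total score:        {total:.3f}")
print(f"without FROM_WORDY: {without_wordy:.3f}")
print(f"still over {THRESHOLD}?      {without_wordy >= THRESHOLD}")
```

With FROM_WORDY removed, the remaining rules sum to well under 5, so that one rule is what pushes this message over the line.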

> BAYES_50? Are you training ham? :)

Yes :-) Does this hit BAYES_00 for you?

# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      35152          0  non-token data: nspam
0.000          0      21542          0  non-token data: nham
0.000          0    4600265          0  non-token data: ntokens
0.000          0 1324316802          0  non-token data: oldest atime
0.000          0 1474845999          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1474783813          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
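
The atime values in that dump are Unix epoch seconds, which are hard to read at a glance. A small Python sketch to convert the three non-zero ones to UTC dates follows. The dictionary labels are copied from the dump, and the helper name `to_utc` is just for illustration.

```python
from datetime import datetime, timezone

# Non-zero atime values copied from the "sa-learn --dump magic" output above.
ATIMES = {
    "oldest atime": 1324316802,
    "newest atime": 1474845999,
    "last expiry atime": 1474783813,
}


def to_utc(epoch_seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


readable = {label: to_utc(ts).isoformat() for label, ts in ATIMES.items()}

for label, stamp in readable.items():
    print(f"{label:18} {stamp}")
```

The oldest atime lands in December 2011 and the newest on the date of this message.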

I recently deleted the database of 11M tokens, disabled autolearn, and
have been retraining it for quite a while now.
