Hi,

>> Is this normal? If so, what is the explanation for this behavior? I have
>> marked dozens of nearly-identical messages with the subject "Garden hose
>> expands up to three times its length" as SPAM (over the course of
>> several weeks), and yet SA reports "not enough usable tokens found".
>
> If they are identical, I don't believe it will create new tokens, per se.
>
>> Is SA referring to the number of tokens in the message? Or the Bayes DB?
>
I should also mention that when training, it's worth using "--progress", like
so (assuming you're running it on an mbox, or on a message that's in mbox
format):

# sa-learn --progress --spam --mbox mymboxfile
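
If the messages are individual files rather than an mbox, you can drop
"--mbox" and point sa-learn at the files, or at a directory containing them
(the path here is just an example):

# sa-learn --progress --spam /path/to/spam-messages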

It will show you how many tokens have been learned during that run. It might
also be a good idea to add the token summary header to your config:

add_header all Tok-Stat _TOKENSUMMARY_
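
That line would normally go in your SpamAssassin config, e.g. a site-wide
local.cf (the exact path varies by system, often
/etc/mail/spamassassin/local.cf). After adding it, you can check that the
config still parses, then restart spamd (or whatever glue calls SpamAssassin)
so it takes effect:

# spamassassin --lint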

If you run spamassassin on a message directly and add the -t option, it will
show you how many tokens of each type were found in the message:

X-Spam-Tok-Stat: Tokens: new, 0; hammy, 6; neutral, 84; spammy, 36.
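
For example, with one of those messages saved to a single file (the filename
here is just an example):

# spamassassin -t < garden-hose-sample.eml | grep Tok-Stat

To see what the Bayes DB itself contains (total token count, plus the number
of spam and ham messages learned so far), sa-learn can also dump its internal
counters, which should help answer the message-vs-database question:

# sa-learn --dump magic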

Regards,
Alex
