PakOgah wrote:
>
> Matt Kettler wrote:
>> Mário Gamito wrote:
>>  
>>> Hi,
>>>
>>> How can i know how many messages did already sa-learn processed ?
>>>     
>> You mean the total number of messages learned in the bayes database
>> (includes sa-learn and autolearn)?
>>
>> sa-learn --dump magic
> and how do I read this information ?
> # sa-learn --dump magic
> 0.000          0          3          0  non-token data: bayes db version
The Bayes DB is in the version 3 format. (The format has changed a couple
of times over the years, but not recently.)
>
> 0.000          0        569          0  non-token data: nspam
You have trained 569 spam messages.
> 0.000          0          7          0  non-token data: nham
You have trained 7 nonspam (ham) messages, which is very few: not enough
for SA to be willing to start using the bayes database to rate mail yet.
By default you need 200 of each type (and I do not recommend changing it
to anything lower except in lab tests to study bayes errors in
under-trained databases).
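
If you'd rather check these counts from a script than by eye, here is a
minimal Python sketch. It assumes the dump lines look exactly like the
ones quoted in this thread (four numeric columns, then "non-token data:"
and a field name). The names parse_magic, bayes_ready and MIN_MESSAGES
are made up for illustration; the 200 is the default mentioned above.

```python
import re

# Default minimum number of trained messages of each type (from the text above).
MIN_MESSAGES = 200

# Matches lines such as:
# 0.000          0        569          0  non-token data: nspam
MAGIC_LINE = re.compile(
    r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$"
)

def parse_magic(dump_text):
    """Return a dict mapping each magic field name to its integer value."""
    fields = {}
    for line in dump_text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            fields[match.group(2)] = int(match.group(1))
    return fields

def bayes_ready(fields, minimum=MIN_MESSAGES):
    """True only if both the spam and ham counts reach the minimum."""
    return (fields.get("nspam", 0) >= minimum
            and fields.get("nham", 0) >= minimum)

# Sample input copied from the dump quoted in this thread.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        569          0  non-token data: nspam
0.000          0          7          0  non-token data: nham
0.000          0      53898          0  non-token data: ntokens
"""

if __name__ == "__main__":
    fields = parse_magic(SAMPLE)
    print(fields["nspam"], fields["nham"], bayes_ready(fields))
```

In practice you would feed it the real output of "sa-learn --dump magic"
instead of the SAMPLE string.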
> 0.000          0      53898          0  non-token data: ntokens
There are 53,898 total tokens in the bayes database. (Small, but not
absurdly so. By default SA aims to keep it between 100k and 150k.
Looking above, you haven't trained enough mail for SA to start
considering throwing out old tokens to keep it under 150k.)
> 0.000          0  987802486          0  non-token data: oldest atime
> 0.000          0 1176482771          0  non-token data: newest atime
The least-recently used token in the database was last accessed
987802486 seconds after January 1st, 1970, and the most-recently used
one was accessed at 1176482771. (Not very interesting except to compare
against each other.)
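
To turn those two atime values into readable dates, something like the
following Python sketch should do. It only assumes the values are plain
Unix timestamps, as described above; atime_to_utc is a made-up helper
name.

```python
from datetime import datetime, timezone

def atime_to_utc(seconds):
    """Convert seconds since 1970-01-01 00:00 UTC to an aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# The two values from the dump quoted above.
oldest = atime_to_utc(987802486)
newest = atime_to_utc(1176482771)

if __name__ == "__main__":
    print("oldest atime:", oldest.isoformat())
    print("newest atime:", newest.isoformat())
    print("span in days:", (newest - oldest).days)
```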
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0          0          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count
>
There's never been a journal sync or expiration of old tokens.

In a young database this is reasonably normal, although I'd eventually
expect a journal sync after you've got enough nonspam for your bayes to
become actively used by SA. Also, you'll never get expiry until your
database is a bit larger. Expiry doesn't kick in until you've got
150,000 tokens, and you've got about a third of that.
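
As a quick sanity check on that "about a third", here is the arithmetic
as a small Python sketch. The 150,000 figure is the expiry threshold
stated above; EXPIRY_THRESHOLD and expiry_progress are just illustrative
names.

```python
# Token count at which expiry kicks in, per the explanation above.
EXPIRY_THRESHOLD = 150_000

def expiry_progress(ntokens, threshold=EXPIRY_THRESHOLD):
    """Return (fraction of threshold reached, tokens still to go)."""
    return ntokens / threshold, max(threshold - ntokens, 0)

if __name__ == "__main__":
    fraction, remaining = expiry_progress(53898)
    print(f"{fraction:.0%} of the threshold, {remaining} tokens to go")
```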
