Rob,

> Because bayes_seen was also quite big I read up on that too.
> Since the table doesn't include any age information and (most)
> everything I found says "just delete it", I emptied the table.
> Although I think it's strange to just throw away information about
> previously seen messages that have been classified as either spam or
> ham. Any other insight into this would be valued.

No need to bother with bayes_seen; just purge it once in a while
when it grows large.

> > Some people include atime information for that purpose.
> 
> Yes, thanks. I ran into a post that mentioned that some time after I
> posted, and added such a field, which will indeed do what I want. (It
> isn't going to help with the imported data though, because that info
> is not available in the original bdb files.)

The main purpose of bayes_seen is to prevent a stream of same-contents
messages arriving in quick succession from polluting the Bayes database.
It is unlikely that a same-contents message arrives more than once
over a long interval, and even if it does, not much harm is done
if it is re-learnt.

I believe bayes_seen had its purpose when mail viruses were
frequent and spam messages were arriving in non-personalized
batches. Those times are long gone.

  Mark
