On 1/30/2024 10:58:52, Matus UHLAR - fantomas wrote:
On 30.01.24 09:59, joe a wrote:
Advisable to "prune" Bayes data based on age?
While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to
2013.
Why that's over . . . wait, I need to take off my socks . . .
So, how old is "too old" for saved SPAM?
I did retrain on old spam a few times and it was working fine.
Depends on how much mail you have:
0.000          0       7542          0  non-token data: nspam
0.000          0      80869          0  non-token data: nham
0.000          0     996032          0  non-token data: ntokens
0.000          0 1172945918          0  non-token data: oldest atime
so even old spam may be fine. You do, however, need plenty of ham to train
on; otherwise everything starts looking like spam.
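
For anyone wanting to read those counters at a glance, here is a minimal Python sketch. It assumes the four lines above come from `sa-learn --dump magic`, that the value of interest is always the third column, and that "oldest atime" is a Unix timestamp. The names `SAMPLE` and `parse_magic` are made up for illustration and are not part of SpamAssassin.

```python
from datetime import datetime, timezone

# Sample lines copied from the counters quoted above.
SAMPLE = """\
0.000 0 7542 0 non-token data: nspam
0.000 0 80869 0 non-token data: nham
0.000 0 996032 0 non-token data: ntokens
0.000 0 1172945918 0 non-token data: oldest atime
"""


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    stats = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue  # skip anything that is not a counter line
        fields, label = line.split("non-token data:", 1)
        cols = fields.split()
        if len(cols) < 3:
            continue  # malformed line, ignore it
        stats[label.strip()] = int(cols[2])
    return stats


stats = parse_magic(SAMPLE)

# How many ham messages were learned for each spam message.
ham_per_spam = stats["nham"] / stats["nspam"]

# Assumption: "oldest atime" is seconds since the Unix epoch.
oldest = datetime.fromtimestamp(stats["oldest atime"], tz=timezone.utc)

print(f"ham per spam: {ham_per_spam:.1f}")
print(f"oldest atime: {oldest:%Y-%m-%d}")
```

With the numbers above this works out to roughly ten ham messages learned per spam message, and an oldest token access time in early 2007. To try it on a live database, the output of `sa-learn --dump magic` could be passed to `parse_magic` in place of `SAMPLE`.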
Recently missed spam has increased a bit, so I was dropping it into
"missed spam". Then I went poking through marked spam and found lots of
"missed ham", which triggered my pondering.