ch to all-manual learning and hopefully convince enough
users to send in spam and false positives to train it well. Sufficient
participation is a big question, but appears to be the only viable option at
this point.
Wes
ing failed expires
due to 'deadlock detected'.
Regrouping, I was looking at benchmarks for QDBM and see it is on the "we
need volunteers" list. Is this more than just changing the "tie" in the
Bayes DBM store module?
Wes
0-80% CPU constantly.
Wes
as dropped dramatically. With 163,000 loaded, it is down
to 100/second. I decided to start with a clean DB and let auto-learn
repopulate it.
Wes
shows up every couple of days.
I guess the flip side is that if a message is manually learned and you then
keep receiving similar messages (at least more often than the tokens turn
over), the manually-learned information should stay active.
Correct?
Wes
ks can be avoided by sorting
the keys to be updated so that they are always updated in the same order
(and/or retrying should a deadlock be detected).
Wes
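
A minimal sketch of the two ideas in the post above (a fixed update order, plus a retry when a deadlock is still reported). It is not taken from the SpamAssassin Bayes store code. The names `DeadlockDetected`, `update_tokens` and `run_transaction` are hypothetical stand-ins for whatever the database driver and storage module actually provide.

```python
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the database driver's deadlock error."""


def update_tokens(run_transaction, deltas, max_retries=3, backoff=0.05):
    """Apply token count deltas in sorted key order, retrying on deadlock.

    run_transaction is a hypothetical callable that takes a list of
    (token, delta) pairs and applies them inside one transaction.
    Returns the number of retries that were needed.
    """
    # Every writer touches rows in the same (sorted) order, so two writers
    # cannot each hold a row the other one is waiting for.
    ordered = sorted(deltas.items())
    for attempt in range(max_retries + 1):
        try:
            run_transaction(ordered)
            return attempt
        except DeadlockDetected:
            if attempt == max_retries:
                raise  # give up and let the caller decide what to do
            time.sleep(backoff * (2 ** attempt))  # simple exponential backoff
```

Sorting only helps if every writer that touches these rows uses the same ordering; the retry covers the cases where some other writer does not.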
't reasonable, though. I can't see (at least
here) that manual learning would get any kind of significant volume.
Someone's only going to send in a message for manual learning if it is a
leaked spam or a false positive, and then only if they bother to do it. I'd
be surprised if the manual learning volume was 1 in 10,000 of the messages
going through the auto-learning.
Wes
epends on what the update vs. read load is.
I would think it would be extremely useful to be able to treat
manually-learned rules separately from auto-learned rules. In a high volume
environment, you'd want to keep manually learned rules far longer than you
could possibly keep auto-learned ones. Manually learned rules should be
more important.
Wes
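
To make the suggestion above concrete, here is a rough sketch of expiry with a separate retention window for manually-learned tokens. This is a hypothetical design, not how the SpamAssassin Bayes store works; the `Token` fields, the `manual` flag and both TTL values are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical retention windows in seconds; not SpamAssassin settings.
AUTO_TTL = 1 * 24 * 3600
MANUAL_TTL = 90 * 24 * 3600


@dataclass
class Token:
    spam_count: int
    ham_count: int
    last_seen: int        # epoch seconds of the last access
    manual: bool = False  # True if manual training has touched this token


def expire(tokens, now):
    """Return the tokens that survive expiry, keyed as in the input dict."""
    kept = {}
    for key, tok in tokens.items():
        # Manually-learned tokens get the longer retention window.
        ttl = MANUAL_TTL if tok.manual else AUTO_TTL
        if now - tok.last_seen <= ttl:
            kept[key] = tok
    return kept
```

With these example values, a manually-learned token last seen two days ago would be kept, while an auto-learned token of the same age would be dropped.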
es it handle
concurrency, if it has to update the last access time of tokens and learn
new tokens? Are there any numbers on concurrent servers when it starts to
bog down?
Wes
sn't large enough
> it is going to churn so fast that it'll defeat the purpose of even
> having a bayes database.
I had pretty much come to that conclusion, but all the posts I found were
talking about token databases in the low hundreds of thousands, and I've
been seeing millions... Wasn't sure I wasn't overlooking something big.
Wes
-learning.
If this is true, then that tells me that with our volume, we either have to
do all automatic learning, or all manual learning. With both enabled, any
manual learning would likely be lost in less than a day. Ugh.
Wes
ow unlocking lock
[21506] dbg: locker: safe_unlock: unlocked /home/smfs/.spamassassin/bayes.mutex
[21506] dbg: bayes: expiry completed
bayes: synced databases from journal in 0 seconds: 927 unique entries (927 total entries)
expired old bayes database entries in 432 seconds
3702653 entries kept, 1230354 deleted
token frequency: 1-occurrence tokens: 83.22%
token frequency: less than 8 occurrences: 12.56%
Wes
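
For a sense of scale, the figures in the expiry output above can be turned into a share and a rate. This is plain arithmetic on the numbers from the log; the variable names are only illustrative.

```python
# Figures copied from the expiry output above.
kept = 3702653
deleted = 1230354
seconds = 432

total = kept + deleted          # tokens present before the expiry ran
deleted_share = deleted / total
rate = deleted / seconds

print(f"tokens before expiry: {total}")                # 4933007
print(f"share deleted: {deleted_share:.1%}")           # 24.9%
print(f"tokens deleted per second: {rate:.0f}")        # 2848
```

So this run removed roughly a quarter of the database and took a little over seven minutes to do it.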
ion, won't
they also be subject to the (short) expiration period, or is manual learning
kept permanently?
Thanks
Wes