On Tue, Jan 14, 2020 at 12:05:57PM +0000, Nix wrote:
> On 8 Jan 2020, Benjamin Block told this:
>
> > Now, if I run sa-learn again on the same folder (the manual says
> > "SpamAssassin remembers which mail messages it has learnt already,
> > and will not re-learn those messages again, unless you use the --forget
> > option.", so I think this is OK to do), it gets absurdly
> > slow, taking over 2 minutes for the same directory with 45 mails.
> >
> > + /usr/bin/sa-learn --no-sync --progress --ham /var/spool/fetchmail/Maildir/.Congstar
> > 92% [============================= ] 0.30 msgs/sec 02m40s DONE
> > Learned tokens from 0 message(s) (49 message(s) examined)
> >
> > Now imagine this for a folder with over 2k messages (of which I have
> > several).
>
> Possibly related to <https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7587>?

Ah yes, I saw that as well, and thought it might be related. But I saw
they made changes in response to the bug, so I wasn't sure whether it
still applies.

> > Jan 8 23:49:52.209 [308] dbg: TxRep: reputation: none, count: 0, learning: -20, MSG_ID: ec300f7aa9c95003b94439831b843605e9a94660@sa_generated
> > Jan 8 23:49:52.209 [308] dbg: auto-whitelist: add_score: new count: 1, new totscore: 20
> > Jan 8 23:49:53.710 [308] dbg: auto-whitelist: DB addr list: untie-ing and unlocking
> > Jan 8 23:49:53.715 [308] dbg: auto-whitelist: DB addr list: file locked, breaking lock
> > Jan 8 23:49:53.716 [308] dbg: locker: safe_unlock: unlink /var/spool/fetchmail/.spamassassin/tx-reputation.lock
>
> ... looks like it to me. It's at least spotting the lock and breaking
> it, but it's still taking a second and a half to do it, and it happens
> for each message. That's better than the 90s it used to take, but still
> bad.
>
> I've come to the conclusion that TxRep is essentially unmaintained and
> basically doesn't work unless you use SQL storage, and have migrated
> back to the AWL, which still works fine. I hope I'm wrong.

Hmm, interesting. Maybe I should try SQL then, to see whether it's
faster with that. It makes my setup more complex, though, and I'm not a
huge fan of that, but OK.

Thanks,
- Benjamin