On Tue, Jan 14, 2020 at 12:05:57PM +0000, Nix wrote:
> On 8 Jan 2020, Benjamin Block told this:
>
> > Now, if I run sa-learn again on the same folder (the manual says 
> > "SpamAssassin remembers which mail messages it has learnt already,
> > and will not re-learn those messages again, unless you use the --forget 
> > option.", so I think this is OK to do), it gets absurdly
> > slow, taking over 2 minutes for the same directory with 45 mails.
> >
> > + /usr/bin/sa-learn --no-sync --progress --ham /var/spool/fetchmail/Maildir/.Congstar
> >  92% [=============================        ]   0.30 msgs/sec 02m40s DONE
> > Learned tokens from 0 message(s) (49 message(s) examined)
> >
> > Now imagine this for a folder with over 2k messages (of which I have 
> > several).
>
> Possibly related to <https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7587>?

Ah yes, I saw that as well, and thought it might be related. But I saw
that they made changes in response to the bug, so I wasn't sure whether
it still applied.

>
> > Jan  8 23:49:52.209 [308] dbg: TxRep: reputation: none, count: 0, learning: -20, MSG_ID: ec300f7aa9c95003b94439831b843605e9a94660@sa_generated
> > Jan  8 23:49:52.209 [308] dbg: auto-whitelist: add_score: new count: 1, new totscore: 20
> > Jan  8 23:49:53.710 [308] dbg: auto-whitelist: DB addr list: untie-ing and unlocking
> > Jan  8 23:49:53.715 [308] dbg: auto-whitelist: DB addr list: file locked, breaking lock
> > Jan  8 23:49:53.716 [308] dbg: locker: safe_unlock: unlink /var/spool/fetchmail/.spamassassin/tx-reputation.lock
>
> ... looks like it to me. It's at least spotting the lock and breaking
> it, but it's still taking a second and a half to do it, and it happens
> for each message. That's better than the 90s it used to take, but still
> bad.
>
> I've come to the conclusion that TxRep is essentially unmaintained and
> basically doesn't work unless you use SQL storage, and have migrated
> back to the AWL, which still works fine. I hope I'm wrong.

Hmm, interesting. Maybe I should try SQL storage then, to see whether
it's faster with that. It makes my setup more complex, which I'm not a
huge fan of, but OK.
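
For a sense of scale, here is a rough back-of-the-envelope sketch that
extrapolates the 0.30 msgs/sec rate from the sa-learn run quoted above
to one of the larger folders. It assumes the rate stays constant, and
it takes 2000 as a stand-in for "over 2k messages"; the function names
are made up for illustration only.

```python
# Rough estimate only: assumes sa-learn keeps the same per-message rate
# regardless of folder size, which may not hold in practice.

def estimate_seconds(messages: int, msgs_per_sec: float) -> float:
    """Time needed to examine `messages` mails at a constant rate."""
    return messages / msgs_per_sec


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as e.g. 1h02m03s."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


RATE = 0.30           # msgs/sec, from the --progress output quoted above
LARGE_FOLDER = 2000   # assumed size for a folder with "over 2k messages"

if __name__ == "__main__":
    print(format_duration(estimate_seconds(LARGE_FOLDER, RATE)))
```

So at this rate a single large folder would take on the order of two
hours per re-run, even when no new tokens are learned.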

Thanks,
 - Benjamin
