That's confirmed.  sa-learn doesn't like compressed files.  I don't know if
it will dine on compressed files with the correct extension (e.g., .gz).
Unfortunately, when using compression with Maildir format, Dovecot doesn't
seem to like to use extensions.  So, I copied the directory to a temporary
location, decompressed the files and then set sa-learn on them.  Even
getting gunzip to operate on the files was a pain because it only wants
files with the .gz extension (so I had to rename all 6,000 of them first -
using a utility like 'rename').  I then did the same thing with about 9,000
hams.
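
For anyone wanting to skip the rename step, here is a rough Python sketch of
the same copy-and-decompress idea.  It looks at each file's leading bytes
rather than its name, so extension-less compressed Maildir files are handled.
The names opener_for and plain_copy are made up for illustration, and it
assumes only gzip or bzip2 compression is in use; treat it as a starting
point, not a tested tool.

```python
import bz2
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of any bzip2 stream


def opener_for(path):
    """Pick an opener from the file's leading bytes, not from its name."""
    with open(path, "rb") as fh:
        head = fh.read(3)
    if head.startswith(GZIP_MAGIC):
        return gzip.open
    if head.startswith(BZIP2_MAGIC):
        return bz2.open
    return open  # assume the message is already plain text


def plain_copy(maildir, dest):
    """Write an uncompressed copy of each message in cur/ and new/ to dest.

    The original Maildir is only read, never modified.
    Returns the number of messages copied.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    count = 0
    for sub in ("cur", "new"):
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue
        for msg in folder.iterdir():
            if not msg.is_file():
                continue
            opener = opener_for(msg)
            with opener(msg, "rb") as src, open(dest / msg.name, "wb") as out:
                shutil.copyfileobj(src, out)
            count += 1
    return count
```

After something like plain_copy("/path/to/Maildir/.Junk", "/tmp/spam-plain"),
sa-learn --spam /tmp/spam-plain can be pointed at the temporary copy (both
paths are placeholders).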

There was much good news.  Learning proceeded at about the same pace, but
syncing the journal to the database was *much* faster.  Maybe the tokens
were smaller?  I verified that it seemed to work with sa-learn --dump magic.

Then, all by itself, SpamAssassin's Bayes filtering was instantly much
better.  Stuff that was tripping BAYES_00 was suddenly popping BAYES_99.

Now, I just need to update my nightly learning/reporting script.

Still, a very nice result.

On Fri, May 21, 2021 at 11:30 AM Henrik K <h...@hege.li> wrote:

> On Fri, May 21, 2021 at 10:54:54AM -0400, Clive Jacques wrote:
> > Do spamassassin or sa-learn understand compressed files or compressed
> > Maildir?
>
> I believe sa-learn will automatically decompress if the files have .gz or
> .bz2 extension, but yes Maildir files without extension will not work.
>
> Should be easy to detect compressed Maildir files, perhaps file enhancement
> request in bugzilla.
>
>
