On 28.05.19 00:13, hg user wrote:
The server was installed and configured by a "zimbra man", a person I fully
trust. Since I manage a commercial antivirus/antispam solution that does not
work properly for the Italian language, I was asked to join the project
to find out whether we could switch from the proprietary solution to
spamassassin.

I'm now in the process of double-checking the configuration of spamassassin
and feeding the bayes engine...

Testing the system I noticed that spamassassin logged the internal MTAs
(including the antivirus server) as external and I asked *the zimbra man*
to correct the configuration. He replied it was not necessary. Sorry I
didn't specify I asked the person in charge of the system.

I believe it really is not necessary, because zimbra handles this itself
and uses a modified SA source.

If your "spamassassin" binary is not the one from zimbra, that is apparently
the reason why you have problems with the trust path configuration and also
with the bayes database.

I don't recommend mixing zimbra's internal SA with an SA installed from
elsewhere.

Unfortunately, the spamassassin documentation is not really clear and asking
google can be even more confusing... I found posts stating that nham/nspam
reported by --dump magic are either tokens or messages... according to a
test I did this afternoon, feeding 2 messages to sa-learn as ham, those
numbers are tokens.

0.000          0       5232          0  non-token data: nspam
0.000          0      70408          0  non-token data: nham
0.000          0     388070          0  non-token data: ntokens

I believe the first two are counts of mail and the last one is the count of
tokens, and also that the names are self-explanatory.
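
To compare these counters before and after a learning run, a small script
can pull them out of the dump. This is only a sketch: it assumes the output
of "sa-learn --dump magic" has been saved as text, that every counter line
contains "non-token data:" and that the value sits in the third column, as
in the excerpt above. The name parse_dump_magic is made up for illustration.

```python
def parse_dump_magic(text):
    """Return {counter_name: value} for the 'non-token data' lines."""
    counters = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue  # skip anything that is not a counter line
        fields, name = line.split("non-token data:", 1)
        columns = fields.split()
        # Assumption: the value is the third column, as in the excerpt.
        counters[name.strip()] = int(columns[2])
    return counters


sample = """\
0.000          0       5232          0  non-token data: nspam
0.000          0      70408          0  non-token data: nham
0.000          0     388070          0  non-token data: ntokens
"""

counters = parse_dump_magic(sample)
print(counters["nspam"], counters["nham"], counters["ntokens"])
```

Running it once before and once after sa-learn on a known number of
messages shows whether nham/nspam move by messages or by tokens.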

I noticed that the nham counter kept increasing for several minutes after
sa-learn ended, probably due to the --no-sync parameter... this could also
explain why immediately after the sa-learn of the spam message bayes
reported BAYES_50 and a few minutes later BAYES_00: the engine was still
learning and as new tokens were recorded they changed the result.

In the end, I need to think about the answer from RW: spamassassin is run by
amavis but with no internal servers defined, so it treats my internal ones as
external. The Received header needs some more care, and probably the list
of stop words should also be expanded. There is probably a rationale behind
some decisions taken by the developers, but I can't fully understand how the
destination address can help decide whether a message is spam or not, at
least not 6 times.
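
To illustrate why an empty internal list pushes every hop into
x-spam-relays-external, here is a deliberately simplified sketch. It is not
spamassassin's actual code: it only assumes that relays are examined from
the most recent hop backwards and that, once one hop falls outside the
configured internal networks, every older hop counts as external too. The
function name and all addresses are invented for the example.

```python
import ipaddress


def split_relays(relay_ips, internal_networks):
    """Split relay IPs (most recent hop first) into (internal, external)."""
    nets = [ipaddress.ip_network(n) for n in internal_networks]
    internal, external = [], []
    still_internal = True
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if still_internal and any(addr in net for net in nets):
            internal.append(ip)
        else:
            # Simplifying assumption: after the first non-internal hop,
            # all older hops are treated as external as well.
            still_internal = False
            external.append(ip)
    return internal, external


# Hypothetical path: two internal MTAs, then the sending host.
hops = ["10.0.0.5", "10.0.0.9", "203.0.113.7"]

print(split_relays(hops, []))               # nothing configured
print(split_relays(hops, ["10.0.0.0/24"]))  # internal MTAs listed
```

With nothing configured, all three hops end up in the external list, which
matches what the -D output showed here.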

Tomorrow I will try some -D bayes runs on different messages to try to
understand better what the plugin is doing, and I will try to read all the
source code. Unfortunately I don't know Perl...

Probably the best solution is to change the configuration, zap the bayes db
and sa-learn the whole corpus I set aside....

I recommend asking on the zimbra forums when you are messing with zimbra's
bayes database and zimbra SA settings.

On Mon, May 27, 2019 at 8:06 PM Matus UHLAR - fantomas <uh...@fantomas.sk>
wrote:

On 27.05.19 18:04, hg user wrote:
>I was writing a message requesting advice on bayes_ignore_header since I
>was sure something was wrong when I decided to have a look at spamassassin
>-D bayes output... and I was shocked by what I saw !
>
>x-spam-relays-external lists all the hops of the message *including*
>internal servers and so x-spam-relays-internal is empty...  I specifically
>asked to add the antivirus and other internal MTAs to the internal list...

how?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
The only substitute for good manners is fast reflexes.
