At 11:23 PM +0200 06/30/2013, Benny Pedersen wrote:
does it continue if one msg is learned as spam? does it still say
BAYES_50 afterwards?
No, it has BAYES_99 if I learn the message. That is, running SA on
the SAME message will give BAYES_99 after it's learned. It's not a
ham problem.
you should just stop going to the URLs in the spam mails. one more
point is that MailScanner mangles content, which here poisons the
Bayes digest
I am _NOT_ going to the URLs in the spam mail. I'm not sure what you
mean by that suggestion. I know MailScanner is munging the URLs, but
that is only for web bugs (not for links). Also, see below.
verify that mails are sent first to SpamAssassin and that MailScanner
mangles LAST in the chain
The way my system is set up, there is no way to get SA to run before
MailScanner. MailScanner has to run first. Changing this would take a
lot of reconfiguring, unfortunately.
it's very important that SpamAssassin sees the original content unmangled
We had that discussion a few weeks ago -- since MailScanner munges
both ham and spam, it has essentially no effect on the Bayes score.
At 12:01 AM +0100 07/01/2013, RW wrote:
The source of the body tokens is:
  $msgdata->{bayes_token_body} =
      $msg->{msg}->get_visible_rendered_body_text_array();
  $msgdata->{bayes_token_inviz} =
      $msg->{msg}->get_invisible_rendered_body_text_array();
which suggests it's rendered. The debug is consistent with this:
So are you saying Bayes won't see rawbody at all? It just uses body?
Or does it have tokens from both body and rawbody?
Also, what is "invisible" rendered body text? Would this include HTML
comments?
Even if comments are invisible to the user, they should still end up
inside the body tags. Consider: on every web browser, when you "view
source," you can see comments and similar things. They are not
"rendered" in the sense that they're not displayed, but they are
certainly processed by the HTML engine. Anything within an HTML tag
is processed, which is why you can see comments when you view source.
It's still in the "body" ... just invisible.
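
To illustrate the distinction I mean, here is a minimal Python sketch using
the standard library's html.parser. It is not SpamAssassin code and says
nothing about what SA's own renderer does. It only shows that a generic HTML
parser hands comments to the application as a separate node type from
displayed text. The class and function names (CommentSplitter, split_html)
are made up for this example.

```python
from html.parser import HTMLParser


class CommentSplitter(HTMLParser):
    """Collect text nodes and comment nodes separately (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.text = []      # what a browser would display
        self.comments = []  # parsed, but never displayed

    def handle_data(self, data):
        # Called for text between tags; skip whitespace-only runs.
        if data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        # Called for the content of <!-- ... -->.
        self.comments.append(data.strip())


def split_html(html):
    """Return (visible_text_nodes, comment_nodes) for an HTML string."""
    parser = CommentSplitter()
    parser.feed(html)
    parser.close()
    return parser.text, parser.comments


text, comments = split_html(
    "<html><body><p>Hello</p><!-- hidden marker --></body></html>"
)
```

Whether SA's "invisible" array corresponds to the comments list here is
exactly the open question.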
Because of this, I would hope that HTML comments would end up within
the Bayes "body" tags even if they are invisible. Is there any way
to verify this? Since the debug output shows tokens, I guess one
could make a test email, put some markers inside comments, and see if
those markers show up in the Bayes tokenization debug output...
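
A test message along those lines could be built roughly like this. This is
a sketch under assumptions: the marker strings, addresses, and file name are
hypothetical placeholders I picked so the tokens are unlikely to occur in
real mail, and the script only builds the message. It does not run SA.

```python
from email.message import EmailMessage

# Hypothetical marker tokens, chosen to be unlikely in real mail.
COMMENT_MARKER = "zzbayescommentmarker"
VISIBLE_MARKER = "zzbayesvisiblemarker"


def build_test_message():
    """Build an HTML-only message with one marker in visible text
    and one marker inside an HTML comment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "Bayes comment test"
    html = (
        "<html><body>"
        f"<p>{VISIBLE_MARKER}</p>"
        f"<!-- {COMMENT_MARKER} -->"
        "</body></html>"
    )
    msg.set_content(html, subtype="html")
    return msg


def write_test_message(path):
    """Write the test message to a file that can be fed to SA."""
    with open(path, "wb") as fh:
        fh.write(build_test_message().as_bytes())
```

After calling write_test_message("bayes-comment-test.eml"), I would expect
that running something like "spamassassin -D bayes < bayes-comment-test.eml"
and searching the debug output for the two markers would show which of them
gets tokenized, assuming the debug output lists tokens as it appeared to above.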
Thanks.
--- Amir