At 11:23 PM +0200 06/30/2013, Benny Pedersen wrote:
Does it continue once a message is learned as spam? Does it still say BAYES_50 afterwards?

No, it has BAYES_99 if I learn the message. That is, running SA on the SAME message will give BAYES_99 after it's learned. It's not a ham problem.

You should just stop going to the URLs in the spam mails. One more point: MailScanner mangles content, which here poisons the Bayes digest.

I am _NOT_ going to the URLs in the spam mail. I'm not sure what you mean by that suggestion. I know MailScanner is munging the URLs, but that is only for web bugs (not for links). Also, see below.

Verify that mails are sent first to SpamAssassin and that MailScanner mangles LAST in the chain.

The way my system is set up, MailScanner has to run first. There is no way to get SA to run before MailScanner without a lot of reconfiguring, unfortunately.

It's very important that SpamAssassin sees the original content unmangled.

We had that discussion a few weeks ago -- since MailScanner munges both ham and spam, it has essentially no effect on the Bayes score.

At 12:01 AM +0100 07/01/2013, RW wrote:
The sources of the body tokens are:

$msgdata->{bayes_token_body} = $msg->{msg}->get_visible_rendered_body_text_array();

$msgdata->{bayes_token_inviz} = $msg->{msg}->get_invisible_rendered_body_text_array();

which suggests it's rendered. The debug is consistent with this:

So are you saying Bayes won't see rawbody at all? It just uses body? Or does it have tokens from both body and rawbody?

Also, what is "invisible" rendered body text?  Would this include the comments?

Even if comments are invisible to the user, they should still end up inside the body tags. Consider: on every web browser, when you "view source," you can see comments and similar things. They are not "rendered" in the sense that they're not displayed, but they are certainly processed by the HTML engine. Anything within an HTML tag is processed, which is why you can see comments when you view source. It's still in the "body" ... just invisible.
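To make the distinction concrete, here is a minimal sketch using Python's standard html.parser module. It is only an illustration of how an HTML parser can see comments while reporting them separately from text. It is not SpamAssassin's renderer, and the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class TextAndComments(HTMLParser):
    """Collect text nodes and comments into separate lists."""

    def __init__(self):
        super().__init__()
        self.text = []      # character data between tags
        self.comments = []  # contents of <!-- ... --> blocks

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        if data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())


def split_html(html):
    """Return (text_nodes, comments) found in an HTML string."""
    parser = TextAndComments()
    parser.feed(html)
    parser.close()
    return parser.text, parser.comments


if __name__ == "__main__":
    sample = "<body><p>Hello</p><!-- secret marker --></body>"
    print(split_html(sample))
```

The parser processes both pieces, but a consumer can still choose to keep only one of them. Whether Bayes keeps the comment text is exactly the open question.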

Because of this, I would hope that HTML comments would end up within the Bayes "body" tags even if they are invisible. Is there any way to verify this? Since the debug output shows tokens, I guess one could make a test email, put some markers inside comments, and see if those markers show up in the Bayes tokenization debug output...
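A test message of that kind could be generated with something like the sketch below. It uses Python's standard email package. The marker strings, addresses, and function name are arbitrary choices for illustration, not anything from SpamAssassin or MailScanner.

```python
import sys
from email.message import EmailMessage

# Arbitrary, unlikely-to-occur strings so they are easy to grep for.
VISIBLE_MARKER = "zqvisiblemarker"
COMMENT_MARKER = "zqcommentmarker"


def build_test_message():
    """Build an HTML-only message with one marker in visible text
    and another marker inside an HTML comment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"       # placeholder address
    msg["To"] = "recipient@example.com"      # placeholder address
    msg["Subject"] = "Bayes comment tokenization test"
    html = (
        "<html><body>\n"
        f"<p>Visible text {VISIBLE_MARKER}</p>\n"
        f"<!-- hidden {COMMENT_MARKER} -->\n"
        "</body></html>\n"
    )
    msg.set_content(html, subtype="html")
    return msg


if __name__ == "__main__":
    # Write the raw message to stdout so it can be redirected to a file.
    sys.stdout.write(build_test_message().as_string())
```

The output could be saved to a file and fed to something like `spamassassin -D bayes < test.eml 2>&1 | grep zq`, assuming the debug output lists tokens the way the quoted debug suggests. A token that has never been learned might not be listed, so the message may need to be learned first. I have not verified that.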

Thanks.

                                                --- Amir
