Re: LONGWORDS not hitting?

Amir 'CG' Caspi Sun, 30 Jun 2013 20:37:15 -0700

At 11:23 PM +0200 06/30/2013, Benny Pedersen wrote:

does it continue if one msg is learned as spam, does it still after say bayes_50 ?

No, it has BAYES_99 if I learn the message. That is, running SA on the SAME message will give BAYES_99 after it's learned. It's not a ham problem.

you should just stop going to the urls in the spam mails, one more point is mailscanner mangle content, with here poison bayes digest

I am _NOT_ going to the URLs in the spam mail. I'm not sure what you mean by that suggestion. I know MailScanner is munging the URLs, but that is only for web bugs (not for links). Also, see below.

verify that if mails are sent first to spamassassin and mailscanner mangle LAST in chain

The way my system is set up, there is no way to get SA to run before MailScanner. MailScanner has to run first. Unfortunately, changing this is not possible without a lot of reconfiguring.

its very important spamassassin see original content unmangled

We had that discussion a few weeks ago -- since MailScanner munges both ham and spam, it has essentially no effect on the Bayes score.


At 12:01 AM +0100 07/01/2013, RW wrote:

The sources of the body tokens is:
$msgdata->{bayes_token_body} = $msg->{msg}->get_visible_rendered_body_text_array();
$msgdata->{bayes_token_inviz} = $msg->{msg}->get_invisible_rendered_body_text_array();
which suggests it's rendered. The debug is consistent with this:

So are you saying Bayes won't see rawbody at all? It just uses body? Or does it have tokens from both body and rawbody?


Also, what is "invisible" rendered body text?  Would this include the comments?

Even if comments are invisible to the user, they should still end up inside the body tags. Consider: on every web browser, when you "view source," you can see comments and similar things. They are not "rendered" in the sense that they're not displayed, but they are certainly processed by the HTML engine. Anything within an HTML tag is processed, which is why you can see comments when you view source. It's still in the "body" ... just invisible.

Because of this, I would hope that HTML comments would end up within the Bayes "body" tags even if they are invisible. Is there any way to verify this? Since the debug output shows tokens, I guess one could make a test email, put some markers inside comments, and see if those markers show up in the Bayes tokenization debug output...
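
A rough sketch of that marker test, in Python rather than anything SA-specific. Everything named here is made up for illustration: the marker strings, the addresses, and the helper functions are all hypothetical. It builds an HTML message with one marker in visible text and one inside an HTML comment, and provides a helper that does a plain substring search over whatever debug output gets captured. It assumes nothing about the exact format of the debug lines, and it assumes the markers survive tokenization unchanged, which may not hold.

```python
from email.message import EmailMessage

# Hypothetical marker strings: lowercase, alphanumeric, unlikely to occur
# anywhere else, so a plain substring search is unambiguous.
VISIBLE_MARKER = "zqvisiblemarker"
COMMENT_MARKER = "zqcommentmarker"


def build_test_message() -> EmailMessage:
    """Build an HTML mail with one marker in visible text and one in a comment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Bayes tokenization test"
    # Short lines keep the body 7bit-encoded, so the markers are not split
    # by quoted-printable soft line breaks.
    html = "\n".join([
        "<html><body>",
        f"<p>Visible text {VISIBLE_MARKER}</p>",
        f"<!-- hidden comment {COMMENT_MARKER} -->",
        "</body></html>",
    ])
    msg.set_content(html, subtype="html")
    return msg


def markers_found(debug_output: str) -> dict:
    """Report which markers appear anywhere in the captured debug text."""
    lowered = debug_output.lower()
    return {m: (m in lowered) for m in (VISIBLE_MARKER, COMMENT_MARKER)}


if __name__ == "__main__":
    # Print the raw message; redirect to a file to feed it to the scanner.
    print(build_test_message().as_string())
```

One possible way to use it: save the printed message as bayes_test.eml, run something like `spamassassin -D bayes < bayes_test.eml 2> debug.log` (assuming the debug output goes to stderr on your install), then pass the contents of debug.log to markers_found(). If only the visible marker shows up, the comment text is presumably not reaching the Bayes tokenizer. On a setup like mine the message would also have to pass through MailScanner first, so the result would reflect the munged message, not the original.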


Thanks.

                                                --- Amir
