On Sat, 2008-08-02 at 15:58 +0700, Fuad NAHDI wrote:
> Hi all,
> 
> I have postfix (ver 2.3.3) with mysql, virtual users, amavisd-new, clamav
> and spamassassin (ver 3.2.5), dcc, pyzor and razor running on CentOS
> 5.1.
> Everything works fine but spamassassin + amavisd-new frequently give a
> high score for emails coming from Japan (using Japanese
> character/language).

Sneak preview of the comments below:  Part of the reason Japanese mail
is scored high on your system is that you trained your Bayes to believe
it is spam, and you are seriously punishing senders from .jp domains.
But read on.


> Sample X-Spam-Status:
> --------------------
> X-Spam-Status: Yes, score=11.732 tag=x tag2=5 kill=8 tests=[AWL=0.404,
>      BAYES_99=3.5, DBL_12_LETTER_FLDR=0.2, DBL_12_LETTER_PGIMG=0.2,

Your Bayes is trained badly. Use sa-learn to correct it, and learn
Japanese ham as ham.

>      FM_FRM_RN_L_BRACK=2.674, FM_MULTI_ODD2=1.1, FM_WHITEONWHITE=0.45,

Neither the DBL_* nor the FM_* rules are part of stock SA, with the
notable exception of FM_FRM_RN_L_BRACK.

>      HELO_EQ_JP=1.244, HOST_EQ_JP=1.265, HS_INDEX_PARAM=0.001,

The *_EQ_JP rules are not part of stock SA. Given your complaint, you
seriously should not use these.

>      HTML_IMAGE_RATIO_06=0.001, HTML_MESSAGE=0.001,
>      HTML_NONELEMENT_40_50=0.944, MIME_HTML_ONLY=1.457, SARE_RAND_2=2.5,

Bad sending MUA, composing HTML mail with no text/plain part.

>      SARE_URI_BARGAIN=0.634, SARE_URI_LET_DIG_PIC=1.157,
>      USER_IN_WHITELIST_TO=-6]

SARE_* rules are not part of stock SA.


> Any advice will be appreciated.

Train your Bayes, learn ham mail. Also, drop your AWL database and start
fresh, since it currently maintains an average score of about 12 for
that particular sender.
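
For the curious, here is roughly where the "about 12" comes from. This
is only a sketch, and it assumes auto_whitelist_factor is still at its
default of 0.5 and that AWL adjusts the score by (mean - score) * factor.
The function name is made up for illustration; the two numbers are from
the header quoted above.

```python
# Values taken from the X-Spam-Status header quoted above.
total_score = 11.732   # final score, AWL adjustment included
awl_delta = 0.404      # the AWL=0.404 entry

# Assumption: auto_whitelist_factor is left at its default of 0.5.
AWL_FACTOR = 0.5


def implied_awl_mean(total, delta, factor=AWL_FACTOR):
    """Back out the sender's stored mean score from one AWL adjustment.

    Assumes delta = (mean - base) * factor, where base is the score
    before the AWL adjustment was added.
    """
    base = total - delta
    return base + delta / factor


mean = implied_awl_mean(total_score, awl_delta)
print(f"score before AWL:        {total_score - awl_delta:.3f}")
print(f"implied historical mean: {mean:.3f}")
```

If your factor differs, the implied mean shifts accordingly, but the
point stands: AWL remembers this sender as a spammer.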

Get rid of third-party rules if they don't apply to your particular
mail stream. Seriously, reconsider *all* third-party rules and review
their performance on *your* mail. This is a problem you created
yourself, not an issue with SA.


Granted, assuming a neutral Bayes score, no AWL and no whitelisting,
stock SA still scores that example at 5.078, which is slightly beyond
your tag2 threshold of 5.

However, properly learning ham will correct this, and a sane AWL
database will help with future mail, too. If you keep your whitelist,
you'll easily get the score down below 0.
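
To make the arithmetic above checkable, here is a small sketch that
re-adds the hits from the quoted header. The STOCK set is my reading of
which hits remain once Bayes, AWL, the whitelist entry and the
third-party rules are taken out; treat it as an assumption, not an
authoritative list. The helper name parse_tests is made up for this
example.

```python
import re

# The tests=[...] list from the X-Spam-Status header quoted above.
TESTS = """
AWL=0.404, BAYES_99=3.5, DBL_12_LETTER_FLDR=0.2, DBL_12_LETTER_PGIMG=0.2,
FM_FRM_RN_L_BRACK=2.674, FM_MULTI_ODD2=1.1, FM_WHITEONWHITE=0.45,
HELO_EQ_JP=1.244, HOST_EQ_JP=1.265, HS_INDEX_PARAM=0.001,
HTML_IMAGE_RATIO_06=0.001, HTML_MESSAGE=0.001,
HTML_NONELEMENT_40_50=0.944, MIME_HTML_ONLY=1.457, SARE_RAND_2=2.5,
SARE_URI_BARGAIN=0.634, SARE_URI_LET_DIG_PIC=1.157,
USER_IN_WHITELIST_TO=-6
"""

# Assumption: the hits left over with neutral Bayes, no AWL, no
# whitelisting and no third-party rules.
STOCK = {
    "FM_FRM_RN_L_BRACK", "HS_INDEX_PARAM", "HTML_IMAGE_RATIO_06",
    "HTML_MESSAGE", "HTML_NONELEMENT_40_50", "MIME_HTML_ONLY",
}


def parse_tests(text):
    """Turn 'NAME=score, NAME=score, ...' into a {name: float} dict."""
    return {name: float(value)
            for name, value in re.findall(r"(\w+)=(-?[\d.]+)", text)}


scores = parse_tests(TESTS)
total = sum(scores.values())
stock = sum(scores[name] for name in STOCK)
with_whitelist = stock + scores["USER_IN_WHITELIST_TO"]

print(f"as reported:          {total:.3f}")
print(f"stock rules only:     {stock:.3f}")
print(f"stock plus whitelist: {with_whitelist:.3f}")
```

The last figure is before Bayes has learned anything useful; a ham
verdict from a properly trained Bayes would push it lower still.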

Also, you should consider LARTing the sender to use a proper MUA. Or, if
you run into rules like MIME_HTML_ONLY frequently, adjust the score
locally to better cope with your particular mail.


Now, if someone could please translate Donis reply... ;)

  guenther


-- 
char *t="[EMAIL PROTECTED]";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
