Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-14 Thread Daniel Quinlan
Craig R Hughes writes: > 2.30 coming Real Real Real Soon now! I'm just working through some > last second packaging issues. Please create a 2.3 branch before you release 2.30! It will make it much easier for us to release a 2.31 (maintenance release with critical fixes) if we can do it from th

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-14 Thread Craig R Hughes
2.30 coming Real Real Real Soon now! I'm just working through some last second packaging issues. C Matt Sergeant wrote: MS> Daniel Quinlan wrote: MS> > Michael Moncur <[EMAIL PROTECTED]> writes: MS> >>I think the problem is simple: We have a SUBJ_FULL_OF_8BITS rule for 8-bit MS> >>

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-14 Thread Daniel Quinlan
Matt Sergeant writes: > Why don't we just branch SA for 2.30 now so that any HEAD checkins > don't go into this release unless they're urgent (in which case they > can be merged across) ? Good question! I'm in favor of branching. It lets developers make forward progress on HEAD and you can let

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-14 Thread Matt Sergeant
Daniel Quinlan wrote: > Michael Moncur <[EMAIL PROTECTED]> writes: > > >>I think the problem is simple: We have a SUBJ_FULL_OF_8BITS rule for 8-bit >>subjects. People can score it however they want. The unexpected thing is >>that every 8-bit subject also matches the SUBJ_ALL_CAPS rule, which it

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Daniel Quinlan
Michael Moncur <[EMAIL PROTECTED]> writes: > I think the problem is simple: We have a SUBJ_FULL_OF_8BITS rule for 8-bit > subjects. People can score it however they want. The unexpected thing is > that every 8-bit subject also matches the SUBJ_ALL_CAPS rule, which it > shouldn't. I filed a bug r
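
The overlap described here can be illustrated with a small sketch. The actual SUBJ_ALL_CAPS rule is not shown in this thread, so both functions below are hypothetical stand-ins: `naive_all_caps` assumes the test fires whenever the subject contains no lowercase ASCII letter, and `stricter_all_caps` additionally requires at least one uppercase ASCII letter. Under that assumption, a raw 8-bit subject trips the naive test even though it contains no capitals at all.

```python
import re

# Hypothetical stand-ins for illustration only; these are NOT the
# actual SpamAssassin rule definitions.

def naive_all_caps(subject: bytes) -> bool:
    """Fire when the subject contains no lowercase ASCII letter."""
    return re.search(rb"[a-z]", subject) is None

def stricter_all_caps(subject: bytes) -> bool:
    """Fire only when there is at least one uppercase ASCII letter
    and no lowercase ASCII letter."""
    has_upper = re.search(rb"[A-Z]", subject) is not None
    has_lower = re.search(rb"[a-z]", subject) is not None
    return has_upper and not has_lower

# A raw 8-bit subject (arbitrary KOI8-R bytes, no ASCII letters).
eight_bit_subject = b"\xd7\xc1\xd7\xd9\xc1\xd7\xd9"

print(naive_all_caps(eight_bit_subject))     # the naive test fires
print(stricter_all_caps(eight_bit_subject))  # the stricter test does not
```

With a test shaped like the stricter variant, an 8-bit subject would be scored by SUBJ_FULL_OF_8BITS alone, which matches the behaviour asked for in the thread.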

RE: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Michael Moncur
> The reality of the world doesn't match the RFCs - so we shouldn't > score 8bit > too high. Actually we should (it stops HEAPS of Spam for our ASCII site) - > but maybe it needs to be more obviously configurable? I think the problem is simple: We have a SUBJ_FULL_OF_8BITS rule for 8-bit subjects

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Jason Haar
More of a FYI, but just in case you don't know, but there are LOTS of mailers out there in Europe/Asia where they DO send 8bit chars in the headers. Yes, they know it's not a good idea (due to lack of charset info), but when 90% of your mail is within the same site/country, you'll get away with i

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Vivek Khera
> "DQ" == Daniel Quinlan <[EMAIL PROTECTED]> writes: DQ> As I received it, the email that Arcady posted had 8-bit characters, DQ> but they were safely encoded in quoted-printable. DQ> Subject: =?koi8-r?b?18HX2cHX2Q==?= DQ> However, I checked the version of the rule in CVS and it seems to

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Daniel Quinlan
Daniel Quinlan writes: >> If Russian email is supposed to have 8-bit characters in the Subject >> line, it seems like a bug to me. Can you file one in Bugzilla? Vivek Khera writes: > 8-bit data in email headers is non-sensical -- there is no context in > which to interpret them. Only 7-bit AS
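
For context on the point about 8-bit data lacking a charset: an RFC 2047 encoded word carries both the charset and the transfer encoding inside the header itself. In the Subject quoted elsewhere in this thread, the `b` field marks the payload as base64 (a `q` would mark quoted-printable). A minimal sketch using Python's standard `email.header` module, assuming the header consists of that single encoded word:

```python
from email.header import decode_header

# Subject header as quoted in this thread.
raw_subject = "=?koi8-r?b?18HX2cHX2Q==?="

# An encoded word has the shape =?charset?encoding?payload?=
fields = raw_subject.split("?")
declared_charset = fields[1]
transfer_encoding = fields[2].lower()  # "b" = base64, "q" = quoted-printable

# decode_header returns a list of (bytes, charset) pairs.
payload, charset = decode_header(raw_subject)[0]
subject = payload.decode(charset)

print(declared_charset, transfer_encoding)
print(payload)   # the 8-bit KOI8-R bytes, carried safely in 7-bit ASCII
print(subject)   # the subject as Unicode text
```

The header on the wire is pure 7-bit ASCII; the 8-bit bytes only appear after decoding, and the declared charset says how to interpret them.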

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Vivek Khera
> "DQ" == Daniel Quinlan <[EMAIL PROTECTED]> writes: DQ> If Russian email is supposed to have 8-bit characters in the Subject DQ> line, it seems like a bug to me. Can you file one in Bugzilla? 8-bit data in email headers is non-sensical -- there is no context in which to interpret them. O

RE: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Michael Moncur
> X-Spam-Status: No, hits=3.8 required=5.0 > tests=SUBJ_ALL_CAPS,SUBJ_FULL_OF_8BITS,AWL version=2.20 > X-Spam-Level: *** I've noticed this too. The trouble, I think, is with SUBJ_ALL_CAPS - it always triggers on an 8-bit subject, which combines with SUBJ_FULL_OF_8BITS to create a larger score tha

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-13 Thread Olivier Nicole
> If Russian email is supposed to have 8-bit characters in the Subject > line, it seems like a bug to me. Can you file one in Bugzilla? > > For now, you can work-around the issue by changing the score for > SUBJ_FULL_OF_8BITS to 0.0 and that will deactivate the rule. That's why the next big cha

Re: [SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-12 Thread Daniel Quinlan
Arcady Genkin writes: > Any message, written in Russian, automatically gets 3.8 hits just by > virtue of having 8-bit chars in the subject line and the body. This > bothers me, because it's awfully close to 5. I have the following in > my .spamassassin/user_prefs: > > ok_locales en ru If
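
The workaround suggested in this thread (setting the score for SUBJ_FULL_OF_8BITS to 0.0 to deactivate the rule) would look roughly like this in the .spamassassin/user_prefs file mentioned above. This is a sketch that keeps the poster's existing ok_locales line and adds only the score override:

```
ok_locales en ru
score SUBJ_FULL_OF_8BITS 0.0
```

This only silences SUBJ_FULL_OF_8BITS; SUBJ_ALL_CAPS still fires on such subjects until the rule itself is fixed.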

[SAtalk] Cyrillic encoding raises the spam level to 3.8

2002-06-12 Thread Arcady Genkin
Any message, written in Russian, automatically gets 3.8 hits just by virtue of having 8-bit chars in the subject line and the body. This bothers me, because it's awfully close to 5. I have the following in my .spamassassin/user_prefs: ok_locales en ru Here are some relevant headers from a