On Tue, Mar 05, 2002 at 12:48:53PM -0800, Daniel Quinlan wrote:
| Matthew Cline <[EMAIL PROTECTED]> writes:
| 
| > For those of you who find that English-centricity helps to filter spam,
| > here's a rule that looks for non-ASCII encoding in the subject line:
| >
| > header   NON_ASCII_ENC_SUBJ     Subject =~ /=\?(?:euc-kr|big5|iso-8859-1)\?/
| > describe NON_ASCII_ENC_SUBJ     Non-ASCII encoded subject
| >
| > It just does EUC Korean, Big5 Chinese and ISO Western encodings now,
| > but it's easy enough to add other encodings.
| 
| Actually, iso-8859-1 is for English.

It is for Western Europe.  US-ASCII is a proper subset of every ISO-8859
charset and of UTF-8.
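
The subset claim is easy to check.  A minimal Python sketch (the variable
names are mine, purely for illustration): every byte below 0x80 decodes to
the same text under US-ASCII, ISO-8859-1 and UTF-8, while a high byte such
as 0xE9 is valid ISO-8859-1 but not US-ASCII.

```python
# All 128 US-ASCII code points, as raw bytes.
ascii_bytes = bytes(range(128))

# Decoding those bytes gives the same text under each charset,
# because US-ASCII is a subset of ISO-8859-1 and of UTF-8.
as_ascii = ascii_bytes.decode("ascii")
as_latin1 = ascii_bytes.decode("iso-8859-1")
as_utf8 = ascii_bytes.decode("utf-8")
same_text = as_ascii == as_latin1 == as_utf8

# The reverse does not hold: byte 0xE9 is a valid ISO-8859-1
# character (e with acute accent) but is not valid US-ASCII.
try:
    b"\xe9".decode("ascii")
    high_byte_is_ascii = True
except UnicodeDecodeError:
    high_byte_is_ascii = False
```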

| Also, some non-spam mail programs unwittingly use iso-8859-1
| encoding in the Subject: line for plain old ASCII.

I do agree with your point, though: most English-only mail uses
iso-8859-1 anyway.
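
To illustrate why the charset label alone says little, here is a rough
Python sketch using the standard email.header module.  The helper
has_non_ascii_payload and the two sample subjects are made up for this
example; only the regex is taken from the rule quoted above.  Both
subjects match the rule, but only one of them carries a non-ASCII byte.

```python
import re
from email.header import decode_header

# Same pattern as the quoted rule.  It has no case-insensitive flag, so
# an upper-case "ISO-8859-1" label would not match it.
RULE = re.compile(r"=\?(?:euc-kr|big5|iso-8859-1)\?")

def has_non_ascii_payload(subject):
    """Return True if any decoded part of the header is outside US-ASCII."""
    for payload, _charset in decode_header(subject):
        # Encoded-words come back as bytes; a header with no
        # encoded-words comes back as a plain str.
        codes = map(ord, payload) if isinstance(payload, str) else payload
        if any(code >= 0x80 for code in codes):
            return True
    return False

# Hypothetical subjects: the first is plain ASCII with a charset label,
# the second really contains an ISO-8859-1 character (0xE9).
labelled_only = "=?iso-8859-1?Q?Meeting_at_noon?="
really_latin1 = "=?iso-8859-1?Q?Caf=E9_menu?="

for subject in (labelled_only, really_latin1):
    print(bool(RULE.search(subject)), has_non_ascii_payload(subject))
```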

| This US/English-specific approach is fundamentally broken.  spamassassin
| should be able to figure out the predominant MIME encoding of emails and
| score uncommon ones differently.

Right.  See CHARSET_FARAWAY for a starting point.
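
As a rough sketch of the "score uncommon ones differently" idea, and not a
description of how CHARSET_FARAWAY is implemented: the Python below counts
the charsets a user has received so far and returns a rarity value for a
new one.  The function name charset_rarity, the sample history and the
0-to-1 scale are all assumptions made up for this example.

```python
from collections import Counter

def charset_rarity(seen_charsets, charset):
    """Return a value in [0, 1]; higher means the charset is rarer in the history."""
    # Charset names are case-insensitive, so normalise before counting.
    counts = Counter(name.lower() for name in seen_charsets)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no history yet, so no basis for a penalty
    return 1.0 - counts[charset.lower()] / total

# Hypothetical history: charsets of ten previously received messages.
history = ["iso-8859-1"] * 8 + ["us-ascii"] + ["euc-kr"]

common = charset_rarity(history, "ISO-8859-1")  # seen in 8 of 10 messages
rare = charset_rarity(history, "euc-kr")        # seen in 1 of 10 messages
unseen = charset_rarity(history, "big5")        # never seen
```

A real rule would still need to map this value onto a score and decide how
much history is enough before trusting it.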

-D

-- 

How great is the love the Father has lavished on us,
that we should be called children of God!
        1 John 3:1 


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
