On Saturday 08 December 2007 01:15, Karsten Bräckelmann wrote:
<snip>

> > Ok. My fault, I mistook charsets for country codes. But replace se with
> > ru or ch or greek7. The result is the same. You want one charset to be
> > considered as "not ham" and you have to give the whole list to the
> > parameter. And I think it is a long list that is ugly to read (see:
> > http://www.iana.org/assignments/character-sets)
>
> Yes, that list indeed is ugly. However, that is *not* what we are
> talking about. The list of valid locales for ok_locales can be found in
> the docs -- and totals 6, including en...

Only 6? Yes, I found it in the docs. (Yeah, I know: RTFM before you ask
around.) I apologize; with only 6 locales it is not useful to have a
not_ok_locales option.

> > I only want to say that there can be a situation in which you only know
> > that you don't want to consider the XXX charset as an indicator for ham.
>
> Despite its name, ok_locales is *not* about certain charsets being "an
> indicator for ham". The opposite is true. It does not assign a negative
> score. All it does is assign a positive score for charsets "not in
> the ok list".

Maybe I should have said: "an indicator for NOT spam"? Sh.., there are too
many double negations and I'm too tired for that.
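
To untangle the double negations, here is a minimal sketch of the logic
as Karsten describes it. It is not SpamAssassin's code: the table
CHARSET_LOCALE, the function faraway_score and the default score of 3.0
are all hypothetical, and how the charset of a mail is actually
determined is left open.

```python
# Illustrative charset -> locale table. Hypothetical, NOT SpamAssassin's data.
CHARSET_LOCALE = {
    "koi8-r": "ru",
    "iso-2022-jp": "ja",
    "euc-kr": "ko",
    "big5": "zh",
    "tis-620": "th",
}


def faraway_score(charset, ok_locales, score=3.0):
    """Return the points added for a message charset.

    The result is never negative: a charset whose locale is accepted
    adds nothing, and a charset outside the accepted locales adds `score`.
    """
    # Assumption: charsets missing from the table count as Western ("en").
    locale = CHARSET_LOCALE.get(charset.lower(), "en")
    return 0.0 if locale in ok_locales else score


print(faraway_score("koi8-r", {"en"}))        # 3.0: ru is not in the ok list
print(faraway_score("koi8-r", {"en", "ru"}))  # 0.0: accepted, but no bonus
print(faraway_score("iso-8859-1", {"en"}))    # 0.0: Western charset
```

So listing a locale never makes a mail look more like ham; it only
avoids the extra points.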

> > > Anyway, this whole example is non-realistic as is. As Matt pointed out
> > > in a later post, we are talking character sets here, not languages. In
> > > the world of ok_locales, there is no distinction between en and se,
> > > which is just en to ok_locales...
> >
> > As I said, I got confused by it (and maybe still am).
> >
> > Other question: How does SpamAssassin know which charset it should use?
> > Does it provide a list of all charsets and compare, or does it try to
> > find the information in the header of the mail, or ...?
>
> Unfortunately, I don't know either. Although I'd like to...
>
> As per my counter example above, I do not want CHARSET_FARAWAY and
> friends to score on mail, just because a fellow hacker happens to have
> his original name in his sig or From: header. And it probably doesn't
> come as a surprise, that the example actually is real life. ;)
>
>
> Maybe the devs can briefly explain how the charset is being determined.
> Or at least, where exactly in the code one could find it...
>
>   guenther  - who is too lazy to dig through all the code right now :)

Bye
Stefan
