In message <[EMAIL PROTECTED]>, Robert Menschel writes:
>Hello Loren, Mario,
>Wednesday, August 25, 2004, 12:39:23 PM, Loren wrote:
>LW> The specific rule you  asked for would be written as
>LW> header SUB_UNDERSCORES    Subject =~ /__/
>LW> score    SUB_UNDERSCORES    0.1
>LW> But don't use it, or at least not with any significant score.
>Well, actually, a quick scan of my corpus, 24k ham and 46k spam, shows 40
>spam hits and no ham hits. IMO that could warrant a SARE score as high as
>0.777 (my email client often gives different results than mass-check
>does, so don't take this as gospel). Expect to see this in my next SARE
>mass-check request, so we can see if it works on other corpora.

I would advise against it. At least one big free email provider
(yahoo.se; I am not sure about the rest of Yahoo) produces this kind of
subject when quoted-printable encoded headers are sent to and from it,
due to a buggy QP encoder.

Essentially, if there is a space before a word containing a QP-encoded
letter, the encoder erroneously adds one extra `_'.

This eventually leads to subjects like this one:
Subject: Re: Som man bäddar, _____________________får man ligga...
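To illustrate why a subject like the one above can trip the rule, here is a minimal Python sketch. The first part uses the standard library to show that `_' inside a Q-encoded header word simply stands for a space. The function buggy_round_trip is a hypothetical, simplified model of the behaviour described above, not the provider's actual code; it only assumes that each pass adds one literal `_' after a space preceding a word with a non-ASCII letter.

```python
import re
from email.header import decode_header

# In a Q-encoded header word (RFC 2047), '_' represents a space.
encoded = "=?iso-8859-1?Q?f=E5r_man_ligga?="
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)  # "får man ligga"


def buggy_round_trip(subject: str) -> str:
    """HYPOTHETICAL model of the bug: add one literal '_' after any space
    that precedes a word containing a non-ASCII letter."""
    return re.sub(r" (?=\S*[^\x00-\x7f])", " _", subject)


subject = "Re: Som man bäddar, får man ligga..."
once = buggy_round_trip(subject)   # single underscores only
twice = buggy_round_trip(once)     # underscores start to pile up

# The proposed rule body is the regex /__/ applied to the Subject header.
rule = re.compile(r"__")
hit_after_one = rule.search(once) is not None
hit_after_two = rule.search(twice) is not None
```

Under this assumed model, a legitimate reply starts matching /__/ after only two passes through the faulty encoder, which is why a long thread can end up with a long run of underscores.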

//Christer

-- 
| Tellusgatan 54    | Telefon: Hem 031 - 42 52 03     CTH: 031 - 772 5431     |
| 415 19 Göteborg   | Epost:   [EMAIL PROTECTED]  Nalle: +46 (0)707 535757  |
|                   | WWW:     http://www.cd.chalmers.se/~mort/               |
"An NT server can be run by an idiot, and usually is." -- Tom Holub, a.h.b-o-i

