Say what?

jdow Mon, 04 Dec 2006 01:56:55 -0800

I have two copies of a message with the same content and source, sent
two minutes apart. These are the only differences between the messages
after I trimmed out the various verification data and differing times.


===8<---
$ diff first second
0a1
> Status:  U
6c7
<       by mx-avoceta.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id
---
>       by mx-jacana.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id
9c10
<       by smtpout02.lax.untd.com with SMTP id
---
>       by smtpout01.lax.untd.com with SMTP id
72a74,76

===8<---

Of course the various "id" strings all differ as well.

The first message scored Bayes 80. The second scored Bayes 95. Since the
message content is identical, this implies that Bayes is training itself
on header garbage as well as on message content.
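
To make the point concrete, here is a rough sketch of how two otherwise
identical messages can end up with different header tokens. It is an
illustration only: the header_tokens helper and its split-on-whitespace
rule are assumptions made up for this example, not how the Bayes code
actually tokenizes headers. The input lines are the Received: fragments
from the diff above.

```python
import re


def header_tokens(lines):
    """Split header lines into lowercase tokens on whitespace and parentheses.

    Hypothetical tokenizer for illustration; a real Bayes filter
    may tokenize headers quite differently.
    """
    tokens = set()
    for line in lines:
        tokens.update(t.lower() for t in re.split(r"[\s()]+", line) if t)
    return tokens


# Received: fragments from the first copy of the message.
first = [
    "by mx-avoceta.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id",
    "by smtpout02.lax.untd.com with SMTP id",
]

# The same fragments from the second copy, relayed through different hosts.
second = [
    "by mx-jacana.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id",
    "by smtpout01.lax.untd.com with SMTP id",
]

# Tokens seen in only one copy; these are the only inputs that could
# explain a score difference if the bodies are identical.
only_first = header_tokens(first) - header_tokens(second)
only_second = header_tokens(second) - header_tokens(first)

print(sorted(only_first))
print(sorted(only_second))
```

If tokens like these carry different learned probabilities, the two
copies can score differently even though nothing in the body changed.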

Since other sources of filtering already deal with the Received: lines and
the message header id lines, should Bayes be paying any attention to them, too?

Should an id string like L8QWHGMP or an X-UNTD-OriginStamp line such as
below figure into the Bayes algorithm at all?

X-UNTD-OriginStamp: qTKGdH6+6PX6q6wVyyDAiKpzgjuM3gNrL/xEOWaR9Ko1VNgBJE6wCw==

{^_^}
