Yes, it works. I ran sa-learn with -D, and the debug output shows that it
parses the correct Received headers. So I can learn my whole Spam folder if
I want!

...
debug: tokenize: header tokens for *F = "Tam Passier <[EMAIL PROTECTED]>"
debug: tokenize: header tokens for To = "<[EMAIL PROTECTED]>"
debug: tokenize: header tokens for Mime-Version = "1.0"
debug: tokenize: header tokens for *c = "/html"
debug: tokenize: header tokens for *m = " vynlwbkrbjqfl mailmij nl "
debug: tokenize: header tokens for *r = "  mailmij.nl ([218.37.74]) by fe04.mail.jippii.net (8.12.9/8.12.1)         <[EMAIL PROTECTED]>; "
debug: tokenize: header tokens for *r = "  mailmij.nl ([218.37.74]) by fe04.mail.jippii.net (8.12.9/8.12.1)         <[EMAIL PROTECTED]>;    fe04.mail.jippii.net (fe04.mail.jippii.net [195.197.172]) by be2.mail.jippii.net (Postfix) <fuer[EMAIL PROTECTED]>; "
Learned from 1 messages.
debug: synced Bayes databases from journal in 1 seconds: 914 unique entries (2369 total entries)
...

----- Original Message ----- 
From: "Dean Gallea" <[EMAIL PROTECTED]>
To: "'Harri Pesonen'" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Monday, August 11, 2003 4:00 PM
Subject: RE: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
SpamAssassin is here


I guess so. According to the FAQ, the X-Spam-Flag: YES should tell SA-learn
to remove the SA report before learning. But this is a minimal message: the
body is just a URL. No words to learn.

-- Dean

Coffee (n.), a person who is coughed upon.




-----Original Message-----
From: Harri Pesonen [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 1:44 AM
To: Dean Gallea; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
SpamAssassin is here


The problem here is that the FAQ says '"real" message encapsulated as a
message/rfc822 MIME part', which is unclear to me. Is it an attachment or
not? I guess it is, but the FAQ does not mention the body text that now has
the content analysis. I guess that it does not matter; it just gets the
attachment (that is, the original message).
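
To illustrate what "encapsulated as a message/rfc822 MIME part" means in
practice, here is a minimal Python sketch that pulls the original message
back out of such a wrapper. It uses only Python's standard email module and
is not how sa-learn itself works internally; the function name
unwrap_report, the boundary string, and the example.com/example.net
addresses are made up for the example.

```python
import email
from email.message import Message


def unwrap_report(raw: str) -> Message:
    """Return the encapsulated original from a report_safe 1 style wrapper.

    Hypothetical helper: if no message/rfc822 part is found, the parsed
    message is returned unchanged (the report_safe 0 case).
    """
    msg = email.message_from_string(raw)
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one nested Message.
            return part.get_payload(0)
    return msg


# Shortened, made-up wrapper in the same shape as the sample in this thread.
wrapped = """\
From: sender@example.com
Subject: [Spam] Get it discreetly
X-Spam-Flag: YES
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain

This mail is probably spam.

--BOUNDARY
Content-Type: message/rfc822; x-spam-type=original
Content-Disposition: attachment

Received: from mailmij.nl by fe04.example.net; Sun, 10 Aug 2003 22:36:25 +0300
From: sender@example.com
Subject: Get it discreetly
Content-Type: text/html

<html><body>fuerte</body></html>

--BOUNDARY--
"""

original = unwrap_report(wrapped)
print(original["Subject"])
print(original.get_content_type())
```

So the report text sits in the first text/plain part, and the original
message, with its own Received headers, is the second part.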

I tested it now. Here is one spam converted to mbox with DbxConv. Is this OK
to feed to sa-learn?

>From [EMAIL PROTECTED] Fri Jan 10 22:40:04 2003
Received: from localhost [127.0.0.1] by skywalker
 with SpamAssassin (2.55 1.174.2.19-2003-05-19-exp);
 su, 10 elo 2003 22:40:04 +0300
From: Tam Passier <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: [Spam] Get it discreetly
Date: Sat, 09 Aug 2003 23:34:48 -0400
Message-Id: <[EMAIL PROTECTED]>
X-Spam-Flag: YES
X-Spam-Status: Yes, hits=5.8 required=5.0
 tests=BAYES_90,DATE_IN_PAST_12_24,HTML_70_80,HTML_IMAGE_ONLY_02,
       HTML_MESSAGE,MIME_HTML_ONLY
 version=2.55
X-Spam-Level: *****
X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_3F369F94.D59A0000"

This is a multi-part message in MIME format.

------------=_3F369F94.D59A0000
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

This mail is probably spam.  The original message has been attached along
with this report, so you can recognize or block similar unwanted mail in
future.  See http://spamassassin.org/tag/ for more details.

Content preview:  fuerte URI:http://www.med12z.com/pa
URI:http://www.7x24pharm1.com/ra.gif URI:http://www.med12z.com/page/a.html
easy exit [...]

Content analysis details:   (5.80 points, 5 required)
HTML_70_80         (0.3 points)  BODY: Message is 70% to 80% HTML
BAYES_90           (4.0 points)  BODY: Bayesian classifier says spam
probability is 90 to 99%
                   [score: 0.9519]
HTML_MESSAGE       (0.0 points)  BODY: HTML included in message
HTML_IMAGE_ONLY_02 (1.4 points)  BODY: HTML has images with 0-200 bytes of
words
DATE_IN_PAST_12_24 (0.0 points)  Date: is 12 to 24 hours before
Received: date
MIME_HTML_ONLY     (0.1 points)  Message only has text/html MIME parts

The original message did not contain plain text, and may be unsafe to open
with some email clients; in particular, it may contain a virus, or confirm
that your address can receive spam.  If you wish to view it, it may be safer
to save it to a file and open it with an editor.


------------=_3F369F94.D59A0000
Content-Type: message/rfc822; x-spam-type=original
Content-Description: original message before SpamAssassin
Content-Disposition: attachment
Content-Transfer-Encoding: 8bit

Received: via tmail-2003a for fuerte.0; Sun, 10 Aug 2003 22:36:27 +0300
 (EEST)
Received: from fe04.mail.jippii.net (fe04.mail.jippii.net [195.197.172.102])
 by be2.mail.jippii.net (Postfix) with ESMTP id CA4C0916  for
 <[EMAIL PROTECTED]>; Sun, 10 Aug 2003 22:36:27 +0300 (EEST)
Received: from mailmij.nl ([218.37.74.195])
 by fe04.mail.jippii.net (8.12.9/8.12.1) with SMTP id h7AJaNSt014976  for
 <[EMAIL PROTECTED]>; Sun, 10 Aug 2003 22:36:25 +0300 (EEST)
From: Tam Passier <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: Get it discreetly
Date: Sat, 09 Aug 2003 23:34:48 -0400
Mime-Version: 1.0
Content-Type: text/html
Message-Id: <[EMAIL PROTECTED]>
Status:

<html><body>fuerte
<p><a href="http://www.med12z.com/pa"><img
border="0" src="http://www.7x24pharm1.com/ra.gif"></a></p>
<a href="http://www.med12z.com/page/a.html">easy exit</a> </body></html>


------------=_3F369F94.D59A0000--
----- Original Message ----- 
From: "Dean Gallea" <[EMAIL PROTECTED]>
To: "'Harri Pesonen'" <[EMAIL PROTECTED]>
Sent: Monday, August 11, 2003 12:21 AM
Subject: RE: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
SpamAssassin is here


> Harri,
>
> Because of this:
> http://spamassassin.taint.org/faq/index.cgi?req=show&file=faq05.002.htp
>
> -- Dean
>
> If rabbits' feet are so lucky, what happened to the rabbit?
>
>
>
>
> -----Original Message-----
> From: Harri Pesonen [mailto:[EMAIL PROTECTED]
> Sent: Sunday, August 10, 2003 4:59 PM
> To: Dean Gallea; 'Mike Burger'
> Cc: [EMAIL PROTECTED];
> [EMAIL PROTECTED]
> Subject: Re: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
> SpamAssassin is here
>
>
> I would say A3 = YES (with SpamAddIn at least, if the message has
> internet headers), but how can B1 and B2 be YES or MAYBE? This is news
> to me.
>
> ----- Original Message -----
> From: "Dean Gallea" <[EMAIL PROTECTED]>
> To: "'Harri Pesonen'" <[EMAIL PROTECTED]>; "'Mike Burger'"
> <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>;
> <[EMAIL PROTECTED]>
> Sent: Sunday, August 10, 2003 11:48 PM
> Subject: RE: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
> SpamAssassin is here
>
>
> > Wow, too many assumptions left unsaid. Let's try to clarify:
> >
> > Scenario A: User sets report_safe 0 (SAProxy's "Low Safety") = Leave
> >   original spam messages alone, put report in headers
> > Scenario B: User sets report_safe 1 (SAProxy's "Normal Safety"
> >   default) = encapsulate original spam messages and put report in body
> >
> > Mail reader 1: Non-MS (such as Eudora, Mozilla), messages in an
> >   "official" mbox file, with MIME attachments as per the RFC spec
> > Mail reader 2: Outlook Express, spam messages converted using
> >   third-party utility to mbox format with structure/attachments
> >   intact per RFC
> > Mail reader 3: Outlook, spam messages converted to mbox format using
> >   third-party utility with structure altered to not comply with RFC
> >
> > The answers to the question "can SA-learn properly learn my spam",
> > as I understand them, are as follows:
> >
> > A1, A2, B1 = YES
> > B2 = Maybe (Can anyone confirm that messages so converted get
> > properly learned?)
> > A3, B3 = NO
> >
> > -- Dean
> >
> > -----Original Message-----
> > From: Harri Pesonen [mailto:[EMAIL PROTECTED]
> > Sent: Sunday, August 10, 2003 1:24 PM
> > To: Mike Burger
> > Cc: Dan Wing; Theo Van Dinter; Dean Gallea;
> > [EMAIL PROTECTED];
> > [EMAIL PROTECTED]
> > Subject: Re: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
> > SpamAssassin is here
> >
> >
> > Okay, this is now clear. And the answer to my question is "no".
> >
> > ----- Original Message -----
> > From: "Mike Burger" <[EMAIL PROTECTED]>
> > To: "Harri Pesonen" <[EMAIL PROTECTED]>
> > Cc: "Dan Wing" <[EMAIL PROTECTED]>; "Theo Van Dinter"
> > <[EMAIL PROTECTED]>; "Dean Gallea" <[EMAIL PROTECTED]>;
> > <[EMAIL PROTECTED]>;
> > <[EMAIL PROTECTED]>
> > Sent: Sunday, August 10, 2003 8:21 PM
> > Subject: Re: [SAtalk] Re: [Spamassassin-saproxy] Outlook Add-In for
> > SpamAssassin is here
> >
> >
> > > If SpamAssassin is set to "report_safe 0", the entire spam report,
> > > including the analysis, is in the header.
> > >
> > > On Sun, 10 Aug 2003, Harri Pesonen wrote:
> > >
> > > > But if the original message is in attachment, spam status in
> > > > headers, and content analysis in message body text, does sa-learn
> > > > handle it correctly?
> > > >
> >
> >
>
>
>





-------------------------------------------------------
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


