-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Mike,


> > dcarrera ~$ # The following step takes about 1.5 hours.
> > dcarrera ~$ sa-learn --spam --showdots --mbox spam_borrowed_10 
> > ...........
> > Learned from 11 messages.
> > dcarrera ~$ grep 'Subject: ' spam_borrowed_10 | wc -l
> >     6972
> > dcarrera ~$
> 
> How many of the input spam messages had it been fed as spam earlier?
> ISTR someone mentioning that SA keys on Message-ID, and if you're
> using spam from an archive to supplement spam you received at your
> installation there might well be some overlap.

I don't expect very many.  My original corpus only had about 200 
messages.  However, this batch actually is from an archive.

I counted the messages again, this time using "From " instead of
"Subject: ", and it seems that this batch actually had 691 messages.
I'm still surprised that there was enough repetition that only 11 of
them were new.
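
For reference, here is a rough sketch of that recount. It assumes a
standard mbox file in which every message begins with a "From " separator
line, and it also counts distinct Message-ID values, on the unconfirmed
assumption raised above that sa-learn may skip messages whose Message-ID
it has already seen. The function names count_messages and
count_unique_ids are made up for illustration.

```shell
# Count messages in an mbox file by its "From " separator lines.
# Assumes body lines starting with "From " are escaped (">From "),
# as most mbox writers do; otherwise this overcounts.
count_messages() {
    grep -c '^From ' "$1"
}

# Count distinct Message-ID header values.
# This is a plain line match, so a Message-ID line quoted inside a
# message body would be counted too, and folded headers are not handled.
count_unique_ids() {
    grep -i '^Message-ID:' "$1" | sort -u | wc -l | tr -d ' '
}
```

Running count_messages spam_borrowed_10 and count_unique_ids
spam_borrowed_10 side by side would show how much of the batch consists
of repeated Message-IDs. Counting "Subject: " lines is less reliable
because that string can also appear in message bodies.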

Well, I tried with another batch and the results were better: ~450 new
messages in a file that contained about 500.  So that certainly looks
reasonable.  Perhaps this was just a very unusual collection of spam.

Thanks.
- -- 
Daniel Carrera    | OpenPGP fingerprint:
Mathematics Dept. | 6643 8C8B 3522 66CB D16C D779 2FDD 7DAC 9AF7 7A88
UMD, College Park | http://www.math.umd.edu/~dcarrera/pgp.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (SunOS)

iD8DBQE/HusynxE8DWHf+OcRAq9PAJ4lq8GLo5lY3va0WQjE0WTchmuJRgCffN8K
A6GYU1XbWdAr7+ue5XeF9xM=
=bw9j
-----END PGP SIGNATURE-----

