-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Mike,

> > dcarrera ~$ # The following step takes about 1.5 hours.
> > dcarrera ~$ sa-learn --spam --showdots --mbox spam_borrowed_10
> > ...........
> > Learned from 11 messages.
> > dcarrera ~$ grep 'Subject: ' spam_borrowed_10 | wc -l
> > 6972
> > dcarrera ~$
>
> How many of the input spam messages had it been fed as spam earlier?
> ISTR someone mentioning that SA keys on Message-ID, and if you're
> using spam from an archive to supplement spam you received at your
> installation there might well be some overlap.

I don't expect very many. My original corpus only had about 200
messages. However, this batch actually is from an archive.

I tried counting the messages again, this time using "From " instead of
"Subject: ", and it seems that this batch actually had 691 messages, not
6972. I'm still surprised that there was enough repetition that only 11
of them were new.

Well, I tried with another batch and the results were better: ~450 new
messages in a file that contained about 500. So that certainly looks
reasonable. Perhaps this was just a very unusual collection of spam.

Thanks.
- -- 
Daniel Carrera      | OpenPGP fingerprint:
Mathematics Dept.   | 6643 8C8B 3522 66CB D16C D779 2FDD 7DAC 9AF7 7A88
UMD, College Park   | http://www.math.umd.edu/~dcarrera/pgp.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (SunOS)

iD8DBQE/HusynxE8DWHf+OcRAq9PAJ4lq8GLo5lY3va0WQjE0WTchmuJRgCffN8K
A6GYU1XbWdAr7+ue5XeF9xM=
=bw9j
-----END PGP SIGNATURE-----

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
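
For anyone repeating the count discussed in this thread, here is a rough sketch of the two checks: counting messages by their mbox "From " separator lines, and counting distinct Message-ID values to estimate how much repetition a batch contains. The function names are made up for illustration. Whether sa-learn really keys on Message-ID is only what was recalled in the thread, so treat the second number as a hint rather than a prediction of what sa-learn will report. Both counts are approximate: a header-like line inside a message body would also match.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of SpamAssassin.

# Count messages in an mbox file by counting "From " separator lines.
# The ^ anchor matters: an unanchored pattern such as 'Subject: ' also
# matches text inside message bodies and can badly overcount.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Count distinct Message-ID values. The header name is matched in either
# common capitalisation and stripped so that only the ID itself is compared.
count_unique_message_ids() {
    sed -n 's/^[Mm]essage-[Ii][Dd]:[[:space:]]*//p' "$1" | sort -u | wc -l | tr -d ' '
}
```

After sourcing the file, something like `count_mbox_messages spam_borrowed_10` followed by `count_unique_message_ids spam_borrowed_10` would show both numbers; a large gap between them suggests many repeated messages in the batch.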