Hi,

The spam corpus I used is actually a set of messages caught
by SA; I don't know how many unique messages there are
(maybe 80%?), but there are more than 12,000 in total. Since
SA already catches these, the corpus won't help SA much, but
I'm guessing that it helps to seed bayes.

The tgz file is 27 MB in size. I can make it available via
FTP if you'd like.

Ricardo

----- Original Message Follows -----
> I haven't seen one  :)
> 
> I think you have a pretty solid start...  :)
> 
> care to share your SPAM corpus?
> 
> CT
> 
> ----- Original Message ----- 
> From: "Ricardo Kleemann" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, August 15, 2003 10:57 AM
> Subject: [SAtalk] archives for seeding bayes?
> 
> 
> > Hi,
> >
> > I've trained my bayes database with about 12,000 spam
> > and 7,000 ham messages, but I was wondering if there are
> > much larger archives available for seeding bayes?
> >
> > Thanks
> > Ricardo
> >
> >
> > -------------------------------------------------------
> > This SF.Net email sponsored by: Free pre-built ASP.NET
> > sites including Data Reports, E-commerce, Portals, and
> > Forums are available now. Download today and enter to
> > win an XBOX or Visual Studio .NET.
> > http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
> > _______________________________________________
> > Spamassassin-talk mailing list
> > [EMAIL PROTECTED]
> > https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
> 


