[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> I am wondering whether there is somewhere that has hundreds or
> thousands of spam emails available for download, so that I can put
> them in a directory and train SA.
> 
> Or maybe I am headed in the wrong direction?

I think you are headed in the wrong direction.

SpamAssassin already has hundreds of heuristic rules that it uses to
catch spam.  The Bayes classifier is simply an additional method, and
its purpose is to learn what YOUR spam (and ham) looks like.

If you train the database using someone else's spam, then the Bayes
classifier may come to some incorrect conclusions about your mail.

For instance, if you were to train Bayes using your own inbox for ham
and a global database for spam, the classifier would almost certainly
pick up on differences in the headers: all of the ham passed through
your local mail relay, and none of the spam did.  Conclusion:  Anything
passing through the local relay is ham!  That conclusion is wrong, but
it makes sense once you realize how the system was trained.
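
To make the relay problem concrete, here is a toy sketch in Python.  It
is NOT SpamAssassin's actual Bayes code; it is a bare-bones per-token
estimate with add-one smoothing, and the hostnames, token names, and
messages are all made up for illustration.  The ham set stands in for
your own inbox (every message carries the local relay token), and the
spam set stands in for a downloaded corpus (none of them do).

```python
from collections import Counter

# Hypothetical header token for "this mail passed through my local relay".
RELAY = "received:mail.example.org"


def train(messages):
    """Count how many messages each token appears in."""
    counts = Counter()
    for tokens in messages:
        counts.update(set(tokens))
    return counts


def spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Toy per-token spam probability with add-one smoothing.

    This is an illustrative assumption, not SpamAssassin's formula.
    """
    p_spam = (spam_counts[token] + 1) / (n_spam + 2)
    p_ham = (ham_counts[token] + 1) / (n_ham + 2)
    return p_spam / (p_spam + p_ham)


# Ham from "your own inbox": every message went through the local relay.
ham = [
    [RELAY, "meeting", "tuesday"],
    [RELAY, "lunch", "friday"],
    [RELAY, "report", "attached"],
]

# Spam from "someone else's corpus": none of it touched your relay.
spam = [
    ["received:bulk.example.net", "cheap", "pills"],
    ["received:open.example.com", "cheap", "mortgage"],
    ["received:other.example.net", "winner", "prize"],
]

ham_counts = train(ham)
spam_counts = train(spam)

relay_score = spam_probability(RELAY, spam_counts, ham_counts, len(spam), len(ham))
cheap_score = spam_probability("cheap", spam_counts, ham_counts, len(spam), len(ham))

print(f"relay token: {relay_score:.2f}")
print(f"'cheap':     {cheap_score:.2f}")
```

The relay token comes out looking like a strong ham indicator, even
though in real life every spam delivered to you would carry that same
header.  Training on spam that actually arrived in your own mailbox
avoids the problem, because the relay token then shows up on both sides.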

Judging by the amount of spam being generated these days, it seems
strange to me that you'd be in a hurry to get ahold of some.  A few
weeks is all you need to get yourself a couple hundred spams.  :)

-- 
   [EMAIL PROTECTED] (Fuzzy Fox)     || "Good judgment comes from experience.
sometimes known as David DeSimone  ||  Experience comes from bad judgment."


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
