On December 11, 2003 09:10 am, Adam Denenberg wrote:
> SA List,
>
>   I am writing for feedback about a new project I would like to start
> and would love feedback/help from the excellent community that has been
> built here on the SA lists.
>
>  What I want to start is a Bayes Corpus Project.  I would like to be
> able to allow people to submit confirmed ham and/or spam to a large
> bayes corpus repository (or maybe just spam)  where people could then
> download (or somehow do an sa-learn remotely) to an ongoing updated
> bayes corpus.
>
>   Obviously the corpus will need to be monitored, and confirmed either
> ham or spam before being validated and allowed into the repository.  I
> have the hosting and bandwidth to accommodate such a project and I am
> willing to put forth the effort to get it going.  What I was really
> hoping for is some good feedback as to whether the users here think they
> would benefit from and use such a project.  Bayes is the one thing spammers
> can't control and try to "beat" SA on their own, so if we all had a very
> large accurate spam corpus we would be at a serious advantage.   I
> think the benefit of Bayes speaks for itself based on the posts on this
> list (and personal use).

How do you propose to solve the problem of "one man's spam is another's ham"?
While a Bayesian classifier properly trained by a single user is highly
effective, one cannot simply copy that user's trained Bayes data for use by
another user and expect the same effectiveness.
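
To make that concrete, here is a toy sketch. It is not SpamAssassin's actual
Bayes code (SA combines token probabilities differently); it is a plain
naive Bayes with add-one smoothing, and the two users, their sample mail, and
the function names are all invented for illustration. The same message scores
as ham against one user's training data and as spam against the other's.

```python
import math
from collections import Counter


def train(messages):
    """Count, per class, how many messages contain each token."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = {"spam": 0, "ham": 0}
    for label, text in messages:
        totals[label] += 1
        counts[label].update(set(text.lower().split()))
    return counts, totals


def spam_probability(model, text):
    """Naive Bayes with add-one smoothing, assuming equal class priors."""
    counts, totals = model
    log_odds = 0.0
    for token in set(text.lower().split()):
        p_spam = (counts["spam"][token] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][token] + 1) / (totals["ham"] + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical user A works with mortgages, so that vocabulary is ham.
user_a = train([
    ("ham", "refinance mortgage rates for client"),
    ("ham", "mortgage application approved"),
    ("spam", "cheap pills online"),
    ("spam", "buy pills now"),
])

# Hypothetical user B only ever sees mortgage mail as spam.
user_b = train([
    ("spam", "refinance mortgage rates now"),
    ("spam", "lowest mortgage rates online"),
    ("ham", "meeting agenda for monday"),
    ("ham", "lunch on monday"),
])

message = "new mortgage rates"
print("user A:", round(spam_probability(user_a, message), 2))
print("user B:", round(spam_probability(user_b, message), 2))
```

Handing user B's data to user A would misfile A's legitimate mail, which is
the problem a shared corpus has to deal with.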

Furthermore, won't a public Bayes corpus also require a significant (and
representative) amount of ham?  Even after the ham has been tokenized,
this raises questions of privacy and various other cans of worms ...

Pedro

-- 
Admiration, n.:
        Our polite recognition of another's resemblance to ourselves.
                -- Ambrose Bierce, "The Devil's Dictionary"

