Gary Funck wrote:
>
> > From: Robert Menschel
> Here's an idea that I've been considering for a while: have SA change its
> scoring strategy to use a Neural Net, instead of using the strictly additive
> scoring. SA would still use its custom rules to detect spam markers, but it

Just a couple of comments that may or may not be on the same page...

It is an idea that sort of captures the imagination. Though, isn't that
very similar to what's already being done with the GA, as well as Bayes?
The GA sets scores according to what turns out to be most accurate: the
fewest FPs and FNs. Bayes analyzes selected email tokens to the same end.

What would you feed the neural net for learning? My first inclination is
to train it just like Bayes is trained. An NN would be another approach;
though similar to Bayes, it might have advantages in flexibility and
intelligence beyond token/probability analysis. As you mention below, it
may be difficult to understand the effect of a single factor, which
implies, at least on the advantages side, that an NN could possibly
discover relationships for detecting spam that might otherwise go
undetected.

Bayes works great. One might think an NN would do even better.

Bryan

> would let the NN do the scoring. The advantages of neural nets are that
> they are generally good at discriminating binary categories -- after
> training, the NN can choose to give some attributes a single weighting
> that would drive the decision of spam/ham, or it can combine the weights
> of various attributes.
> A disadvantage, or perhaps an advantage, is that it is difficult to
> understand the effect that a single factor would have on the decision
> process. Another disadvantage is that it may be difficult to combine the
> results provided by the community at large when running a new release's
> rule set. I think there is a way to handle this: users would provide a
> file where each line records the rules that were hit for each message,
> and a designation as to whether this was ham or spam. All of these
> sequences would be combined to train the final neural net.
>
> I found a few papers describing experiments where a Neural Net was used
> to make spam/ham decisions using very simple criteria.
> The results were good, but not impressive; however, I felt that the test
> was oversimplified. I think it would be "interesting" to train a Neural
> Net using the various features that SA detects. I doubt that such an
> approach will gain favor in a production release of SA, but I can see
> where it might be useful in a localized context.

--
I struggle in vain. My foot slips. My life is still a poet's existence.
What could be more unhappy?
 - (Soren Kierkegaard - Either/Or)
http://www.wecs.com/content.htm

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
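
To make the "one line per message" training file from the quoted proposal concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: the line format ("<label> <rule> <rule> ..."), the rule names RULE_A through RULE_D, and the sample data are all made up and are not an SA format. For brevity it also trains a single sigmoid unit (logistic regression), the simplest possible "net", rather than the multi-layer network the proposal presumably has in mind.

```python
import math

# Hypothetical log format (an assumption, not an SA file format):
# "<label> <rule> <rule> ...", one message per line.
SAMPLE_LOG = """\
spam RULE_A RULE_B
spam RULE_A RULE_C
spam RULE_B RULE_C
ham RULE_D
ham RULE_C RULE_D
ham
"""


def parse_log(text):
    """Return (sorted rule names, list of (set_of_rules_hit, label))."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        label = 1.0 if parts[0] == "spam" else 0.0
        rows.append((set(parts[1:]), label))
    rules = sorted({rule for hits, _ in rows for rule in hits})
    return rules, rows


def predict(weights, bias, hits):
    """Estimated spam probability for a message that hit the given rules."""
    z = bias + sum(weights.get(rule, 0.0) for rule in hits)
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, rules, epochs=500, lr=0.5):
    """Fit one weight per rule by stochastic gradient descent on log loss."""
    weights = {rule: 0.0 for rule in rules}
    bias = 0.0
    for _ in range(epochs):
        for hits, label in rows:
            err = predict(weights, bias, hits) - label
            bias -= lr * err
            for rule in hits:  # inputs are 0/1, so only hit rules move
                weights[rule] -= lr * err
    return weights, bias


rules, rows = parse_log(SAMPLE_LOG)
weights, bias = train(rows, rules)
for hits, label in rows:
    print(sorted(hits), label, round(predict(weights, bias, hits), 3))
```

Logs from many users could simply be concatenated before calling parse_log, which is one reading of "all of these sequences would be combined". Note that a single unit like this still adds up per-rule weights, much as the current scoring does; letting rules interact non-additively would need at least one hidden layer.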