Re: Experimental Plugin: MetaSVM

2009-09-17 Thread Marc Perkel
So - whatever happened to this project? Was it finished? decoder wrote: LuKreme wrote: I don't see any need for the model to be dynamic. Periodic recalculation of it should be just fine. I bet even daily reprocessing will prove to be overzealous. Weekly, perhaps even monthly. This is wha

Re: Experimental Plugin: MetaSVM

2009-03-15 Thread LuKreme
On 15-Mar-2009, at 02:29, decoder wrote: I'm thinking that FPs and FNs are Bayes' problem anyway. This tool needs to concentrate on seeing just what rules hit and building off that. I'd go so far as to say that as far as SVM is concerned, there is no such thing as a false positive or negative.

Re: Experimental Plugin: MetaSVM

2009-03-15 Thread Marc Perkel
decoder wrote: LuKreme wrote: This is an excellent idea, but it also needs rule hits on ham, right? You're right if you're saying that the method would work better if there were more ham rules. From what I have seen in my experiments however, the results are also very precise with the curr

Re: Experimental Plugin: MetaSVM

2009-03-15 Thread decoder
LuKreme wrote: I don't see any need for the model to be dynamic. Periodic recalculation of it should be just fine. I bet even daily reprocessing will prove to be overzealous. Weekly, perhaps even monthly. This is what I think as well :) I'm thinking that FPs and FNs are Bayes' problem anywa

Re: Experimental Plugin: MetaSVM

2009-03-15 Thread decoder
LuKreme wrote: This is an excellent idea, but it also needs rule hits on ham, right? You're right if you're saying that the method would work better if there were more ham rules. From what I have seen in my experiments however, the results are also very precise with the current SA ruleset. Bu

Re: Experimental Plugin: MetaSVM

2009-03-14 Thread LuKreme
On 13-Mar-2009, at 21:21, decoder wrote: John Hardin wrote: If you want it to be dynamic, then the plugin could do the appending. However, the model cannot be extended; that means that to incorporate new lines, the whole model must be recalculated. So this can't be done per message but only ma

Re: Experimental Plugin: MetaSVM

2009-03-14 Thread LuKreme
On 13-Mar-2009, at 15:24, John Hardin wrote: On Fri, 13 Mar 2009, decoder wrote: You create one model file once by feeding it a large corpus of ham+spam. The problem is that feeding does not work with an SVM algorithm. You have to train on the _whole_ set _always_, so feeding mails is un

Re: Experimental Plugin: MetaSVM

2009-03-14 Thread Justin Mason
On Fri, Mar 13, 2009 at 21:24, John Hardin wrote: > On Fri, 13 Mar 2009, decoder wrote: > >> You create one model file once by feeding it a large corpus of ham+spam. > >> The problem is that feeding does not work with an SVM algorithm. You have >> to train on the _whole_ set _always_, so feeding m

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread decoder
John Hardin wrote: It needs the score, and not just Y/N Spam/Ham (i.e. from which corpus file it came)? The SVM does not need the score. However, the evaluation tool needs the score because it uses it to calculate the FP/FN rate. I was thinking you'd generate a ham file and a spam file from the l
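
To illustrate the FP/FN calculation mentioned in the message above, here is a minimal sketch in Python. It is not the plugin's evaluation tool: the function name `fp_fn_rates`, the record layout, and the sample scores are hypothetical. It assumes that a message counts as spam when its score reaches a threshold, and uses 5.0, SpamAssassin's default `required_score`.

```python
def fp_fn_rates(records, threshold=5.0):
    """Return (fp_rate, fn_rate) for (true_label, score) pairs.

    Assumption: a message is treated as spam when score >= threshold.
    """
    ham_scores = [score for label, score in records if label == "ham"]
    spam_scores = [score for label, score in records if label == "spam"]

    # False positive: ham that scored at or above the threshold.
    fp = sum(1 for score in ham_scores if score >= threshold)
    # False negative: spam that scored below the threshold.
    fn = sum(1 for score in spam_scores if score < threshold)

    fp_rate = fp / len(ham_scores) if ham_scores else 0.0
    fn_rate = fn / len(spam_scores) if spam_scores else 0.0
    return fp_rate, fn_rate


# Made-up sample data, for illustration only.
records = [
    ("ham", 1.2), ("ham", 6.3), ("ham", -0.5), ("ham", 0.0),
    ("spam", 12.0), ("spam", 4.1), ("spam", 8.8), ("spam", 5.0),
]
fp_rate, fn_rate = fp_fn_rates(records)
print(f"FP rate: {fp_rate:.2f}, FN rate: {fn_rate:.2f}")
```

This is why the score matters for evaluation but not for training: the classifier only needs the label and the rule hits, while the comparison against the stock scoring needs the score each message originally received.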

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread John Hardin
On Fri, 13 Mar 2009, decoder wrote: John Hardin wrote: BAYES_99,FORGED_RCVD_HELO,L_SOME_STD_PROBS,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E4_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RBL_PSBL_01,RCVD_IN_BRBL,RCVD_IN_NJABL_SPAM,SARE_FROM_SPAM_MONEY2,STOX_30,URIBL_BLACK,URIBL_JP_SURBL,URIB

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread decoder
John Hardin wrote: I assume it learns from full message corpora? And all it cares about is the rules that hit? Per my earlier suggestion of learning off the logs + corpora to fix FP/FN, could there be an option to learn off generated minimal corpora files, with their structure being just the rule

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread Marc Perkel
I'm going to bet that there will be static meta rules that will be discovered that can be just added to spamassassin. I'm interested in how this plays out. I'm very optimistic.

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread John Hardin
On Fri, 13 Mar 2009, decoder wrote: You create one model file once by feeding it a large corpus of ham+spam. The problem is that feeding does not work with an SVM algorithm. You have to train on the _whole_ set _always_, so feeding mails is impractical. That's why you do this process _once

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread decoder
AlexB wrote: Chris From the README it's not quite clear: will this work in "autolearn"? If you mean that the plugin can automatically learn with the autolearn setting, the answer is no. Would it be enough to create the model.* files or is it a must to feed it? You create one model file once by f

Re: Experimental Plugin: MetaSVM

2009-03-13 Thread BChasm
This sounds like excellent work. Please do keep us informed about release to the public and such. On Fri, Mar 13, 2009 at 8:33 AM, decoder wrote: > Hi all, > > > as a result of the recent "2+2 != 4" discussion on the list, here is a new > plugin, which tries to learn ham/spam classification onl

Experimental Plugin: MetaSVM

2009-03-13 Thread decoder
Hi all, as a result of the recent "2+2 != 4" discussion on the list, here is a new plugin, which tries to learn ham/spam classification only by knowing which rules triggered and which did not. This is, so to say, an automatic meta rule. The plugin is currently experimental and can only be c
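
As a rough illustration of learning "only by knowing which rules triggered", the sketch below turns comma-separated rule-hit lists into sparse binary feature lines. It is not taken from the plugin: the helper names are hypothetical, the `label index:value` layout follows the common LIBSVM text format (an assumption, since the announcement does not name a library here), and `BAYES_00` is added only to give the ham sample a rule hit.

```python
def parse_hits(field):
    """Split a comma-separated rule-hit list into rule names."""
    return [name.strip() for name in field.split(",") if name.strip()]


def build_vocabulary(samples):
    """Map every rule name seen in the corpus to a 1-based feature index."""
    names = sorted({rule for _, rules in samples for rule in rules})
    return {name: index for index, name in enumerate(names, start=1)}


def to_sparse_line(label, rules, vocab):
    """Emit '+1' (spam) or '-1' (ham) followed by index:1 for each rule hit.

    Rules that did not trigger are simply absent, i.e. implicitly 0.
    """
    indices = sorted(vocab[rule] for rule in set(rules) if rule in vocab)
    target = "+1" if label == "spam" else "-1"
    return " ".join([target] + [f"{index}:1" for index in indices])


# Tiny illustrative corpus: (label, rules that hit).
samples = [
    ("spam", parse_hits("BAYES_99,RAZOR2_CHECK,URIBL_BLACK")),
    ("ham", parse_hits("BAYES_00")),
]
vocab = build_vocabulary(samples)
lines = [to_sparse_line(label, rules, vocab) for label, rules in samples]
for line in lines:
    print(line)
```

Because the vocabulary is built from the whole corpus, adding new training messages means rebuilding the feature file and retraining from scratch, which matches the batch-style training described elsewhere in the thread.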