So, whatever happened to this project? Was it finished?
decoder wrote:
LuKreme wrote:
I don't see any need for the model to be dynamic. Periodic
recalculation of it should be just fine. I bet even daily
reprocessing will prove to be overzealous. Weekly, perhaps even
monthly.
This is wha
On 15-Mar-2009, at 02:29, decoder wrote:
I'm thinking that FPs and FNs are Bayes' problem anyway. This tool
needs to concentrate on seeing just what rules hit and building off
that. I'd go so far as to say that as far as SVM is concerned, there
is no such thing as a false positive or negative.
decoder wrote:
LuKreme wrote:
This is an excellent idea, but it also needs rule hits on ham, right?
You're right if you're saying that the method would work better if
there were more ham rules. From what I have seen in my experiments
however, the results are also very precise with the curr
LuKreme wrote:
I don't see any need for the model to be dynamic. Periodic
recalculation of it should be just fine. I bet even daily
reprocessing will prove to be overzealous. Weekly, perhaps even monthly.
This is what I think as well :)
I'm thinking that FPs and FNs are bayes problem anywa
LuKreme wrote:
This is an excellent idea, but it also needs rule hits on ham, right?
You're right if you're saying that the method would work better if there
were more ham rules. From what I have seen in my experiments however,
the results are also very precise with the current SA ruleset. Bu
On 13-Mar-2009, at 21:21, decoder wrote:
John Hardin wrote:
If you want it to be dynamic, then the plugin could do the
appending. However, the model cannot be extended, that means to
incorporate new lines, the whole model must be recalculated. So this
can't be done per message but only ma
On 13-Mar-2009, at 15:24, John Hardin wrote:
On Fri, 13 Mar 2009, decoder wrote:
You create one model file once by feeding it a large corpus of ham
+spam.
The problem is that feeding does not work with an SVM algorithm.
You have to train on the _whole_ set _always_, so feeding mails is
un
On Fri, Mar 13, 2009 at 21:24, John Hardin wrote:
> On Fri, 13 Mar 2009, decoder wrote:
>
>> You create one model file once by feeding it a large corpus of ham+spam.
>
>> The problem is that feeding does not work with an SVM algorithm. You have
>> to train on the _whole_ set _always_, so feeding m
John Hardin wrote:
It needs the score, and not just Y/N Spam/Ham (i.e. from which corpus
file it came)?
The SVM does not need the score. However, the evaluation tool needs the
score because it uses it to calculate FP/FN rate.
I was thinking you'd generate a ham file and a spam file from the l
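
To make the FP/FN point above concrete, here is a minimal sketch of how an evaluation step might turn scores into FP/FN rates. It is not the plugin's evaluation tool: the function name `fp_fn_rates`, the sample data, and the use of 5.0 (SpamAssassin's default `required_score`) as the cutoff are assumptions made for illustration.

```python
# Hypothetical sketch: not taken from the plugin or its evaluation tool.
def fp_fn_rates(results, threshold=5.0):
    """results: list of (score, is_spam) pairs; returns (fp_rate, fn_rate)."""
    ham = [score for score, is_spam in results if not is_spam]
    spam = [score for score, is_spam in results if is_spam]
    # A false positive is ham scoring at or above the threshold.
    false_pos = sum(1 for score in ham if score >= threshold)
    # A false negative is spam scoring below the threshold.
    false_neg = sum(1 for score in spam if score < threshold)
    fp_rate = false_pos / len(ham) if ham else 0.0
    fn_rate = false_neg / len(spam) if spam else 0.0
    return fp_rate, fn_rate

# Made-up sample: two spam and two ham messages with their scores.
sample = [(12.3, True), (4.1, True), (0.2, False), (6.0, False)]
rates = fp_fn_rates(sample)
```

The label alone (which corpus file a message came from) is enough for training; the score only matters when comparing against the score-based verdict, as in this sketch.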
On Fri, 13 Mar 2009, decoder wrote:
John Hardin wrote:
BAYES_99,FORGED_RCVD_HELO,L_SOME_STD_PROBS,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E4_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RBL_PSBL_01,RCVD_IN_BRBL,RCVD_IN_NJABL_SPAM,SARE_FROM_SPAM_MONEY2,STOX_30,URIBL_BLACK,URIBL_JP_SURBL,URIB
John Hardin wrote:
I assume it learns from full message corpora? And all it cares about is
the rules that hit?
Per my earlier suggestion of learning off the logs + corpora to fix
FP/FN, could there be an option to learn off generated minimal corpus
files, with their structure being just the rule
I'm going to bet that there will be static meta rules that will be
discovered that can be just added to spamassassin. I'm interested in how
this plays out. I'm very optimistic.
On Fri, 13 Mar 2009, decoder wrote:
You create one model file once by feeding it a large corpus of ham+spam.
The problem is that feeding does not work with an SVM algorithm. You
have to train on the _whole_ set _always_, so feeding mails is
impractical.
That's why you do this process _once
AlexB wrote:
Chris
From the README it's not quite clear: will this work in "autolearn"?
If you mean that the plugin can automatically learn with the autolearn
setting, the answer is no.
would it be enough to create the model.* files or is it a must to feed
it?
You create one model file once by f
This sounds like excellent work. Please do keep us informed about release
to the public and such.
On Fri, Mar 13, 2009 at 8:33 AM, decoder wrote:
> Hi all,
>
>
> as a result of the recent "2+2 != 4" discussion on the list, here is a new
> plugin, which tries to learn ham/spam classification onl
Hi all,
as a result of the recent "2+2 != 4" discussion on the list, here is a
new plugin, which tries to learn ham/spam classification only by knowing
which rules triggered and which did not. This is, so to say, an
automatic meta rule.
The plugin is currently experimental and can only be c
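
As a rough illustration of "knowing which rules triggered and which did not", the sketch below encodes each message as a 0/1 vector over the set of known rules. This is an assumed encoding, not the plugin's actual code: the helper names `build_vocabulary` and `encode` are hypothetical, and the two sample messages are made up.

```python
# Hypothetical sketch: helper names and sample messages are illustrative only.
def build_vocabulary(messages):
    """Map every rule name seen in the corpus to a fixed column index."""
    rules = sorted({rule for hits, _ in messages for rule in hits})
    return {rule: i for i, rule in enumerate(rules)}

def encode(hits, vocabulary):
    """Return a 0/1 vector: 1 where the rule triggered, 0 where it did not."""
    vector = [0] * len(vocabulary)
    for rule in hits:
        index = vocabulary.get(rule)
        if index is not None:  # rules unseen at training time are ignored
            vector[index] = 1
    return vector

# Each entry is (rules that hit, label) with label 1 = spam, 0 = ham.
corpus = [
    (["BAYES_99", "RAZOR2_CHECK", "URIBL_BLACK"], 1),
    (["BAYES_00"], 0),
]
vocabulary = build_vocabulary(corpus)
features = [encode(hits, vocabulary) for hits, _ in corpus]
labels = [label for _, label in corpus]
```

Vectors like `features`, together with `labels`, are the kind of input a classifier such as an SVM could be trained on. Because the vocabulary is built from the whole corpus, adding new training data means rebuilding it and retraining from scratch.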