> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Raul
> Dias
> Sent: Saturday, August 02, 2003 2:17 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] separate rules from distribution
>
>
>
> Hi,
>
> On Sat, 2003-08-02 at 16:49, Florin Andrei wrote:
>
> > How about separating the rules from the SpamAssassin distribution
> > itself, and offer them as a separate package? This would be similar to
> > the way intrusion detection systems (Snort, etc.) and antivirus
> > applications work.
>
> I think this is a great idea.
> I thought about this some time ago, and wonder about some gotchas in
> doing this:
> 1 - Lots of rules depend on the SA "engine" in use in order to work,
>     like eval tests.
> 2 - Depending on the type of rule, it would have to check the SA
>     "engine" version at run time to be sure it can be used.

Isn't this handled by the "require_version" directive?
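
To make that concrete, here is a rough Python sketch of the kind of check I have in mind. It is not SpamAssassin's actual parser, and the function name is made up for illustration. It also assumes that a version mismatch in either direction (older or newer) means the file gets skipped, which is how I remember the directive being documented.

```python
def ruleset_usable(cf_text: str, engine_version: str) -> bool:
    """Return False if the file declares a require_version that differs."""
    for raw in cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        parts = line.split()
        if len(parts) == 2 and parts[0] == "require_version":
            # Assumption: any mismatch means "ignore this file".
            return parts[1] == engine_version
    return True  # no version requirement declared
```

A separately distributed rules package could carry one such line per file, and the engine would then ignore files written for another release.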


I was thinking of a slightly different scenario.

I'd like to collaborate with others on developing rules to catch the
latest spam. I was thinking that one way to do this would be to have
a CVS tree that is publicly accessible (it might live on SourceForge
under the spamassassin project, or in its own project). All the usual
rules files (10_misc.cf through 60_whitelist.cf) would be there.

The twist is that all users of this CVS tree can modify existing rules
and add their own. Then, whenever a user decides to sync to the latest
CVS tree, they'll get all the latest rules contributed by others. This
is basically the same as syncing to the development CVS tree, except
that rather than restricting rule contributions to just developers
(and there are plenty of good reasons for that in the production version),
all users will be able to collaborate in real time. To make this
somewhat manageable:

1) when submitting a new rule or changing an existing one, users would have
to add one or more offending messages matched by that rule to a corpus that
is kept under CVS.

2) users would have to agree to run their new rule(s) against a "mass check"
of their own corpus, and possibly the main SA corpus, before submitting their
changes. They would need to be convinced that their changes cause no
noticeable new regressions (i.e., no spikes in false positives or negatives).
[Thinking this over, maybe they should be required to check out the latest
corpus kept under CVS and run against that.]
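
The "no noticeable regressions" test in (2) could be made mechanical. Below is a minimal Python sketch, assuming each mass-check run can be reduced to a list of (true label, flagged or not) pairs. The `Result` type, the function names, and the two thresholds are all made up for illustration and would need tuning.

```python
from dataclasses import dataclass

@dataclass
class Result:
    is_spam: bool   # ground-truth label from the corpus
    flagged: bool   # did the ruleset classify the message as spam?

def error_rates(results):
    """Return (false-positive rate, false-negative rate) for one run."""
    ham = [r for r in results if not r.is_spam]
    spam = [r for r in results if r.is_spam]
    fp = sum(r.flagged for r in ham) / len(ham) if ham else 0.0
    fn = sum(not r.flagged for r in spam) / len(spam) if spam else 0.0
    return fp, fn

def acceptable(before, after, max_fp_increase=0.001, max_fn_increase=0.01):
    """Accept a change only if neither error rate rises past its threshold."""
    fp0, fn0 = error_rates(before)
    fp1, fn1 = error_rates(after)
    return (fp1 - fp0) <= max_fp_increase and (fn1 - fn0) <= max_fn_increase
```

Here `before` would be the run with the current rules and `after` the run with the proposed change, both over the same corpus.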

The benefits/risks are obvious. Here are a few potential problems/issues:

- which version of the SA engine is to be used? Can the rules in CVS be
tagged, so that branches are kept for past releases, with the main branch
tracking the current CVS version of the SA engine?

- how will the scores be updated? My understanding is that rescoring, as it
currently stands, is a nearly week-long process prior to each release.

While we're brainstorming ... if rescoring takes a long time, is there a
way to partition the rescoring tests so that they can be calculated across
a grid of computers (à la SETI@home)? If so, I'm sure there'd be no shortage
of volunteered computing resources.
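
As a rough illustration of the partitioning idea, here is a Python sketch covering only the part that splits cleanly: counting which rules hit which messages. The two rules and all function names are hypothetical toys. The step that turns hit counts into scores is a separate problem, and I don't know how well it would partition.

```python
from collections import Counter

# Hypothetical toy rules: name -> predicate on the raw message text.
RULES = {
    "HAS_VIAGRA": lambda msg: "viagra" in msg.lower(),
    "ALL_CAPS_SUBJECT": lambda msg: msg.split("\n", 1)[0].isupper(),
}

def shard(corpus, n):
    """Split the corpus into n roughly equal shards (round-robin)."""
    return [corpus[i::n] for i in range(n)]

def count_hits(messages):
    """Work unit for one volunteer machine: per-rule hit counts."""
    hits = Counter()
    for msg in messages:
        for name, matches in RULES.items():
            if matches(msg):
                hits[name] += 1
    return hits

def merge(partials):
    """Combine the per-shard counts returned by the volunteers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because each message is checked independently, merging the per-shard counts gives the same totals as one machine scanning the whole corpus.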




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
