On 08/30/2016 11:03 AM, Merijn van den Kroonenberg wrote:


I now realize you asked about SOUGHT while I gave you a bit of SARE
history.

SOUGHT rules were created by Justin Mason, SA's chief dev/inventor for
many years.

They were also independent from the Apache SpamAssassin project and when
he moved on to a new job area, he opted to shut down the system.
It was not a simple setup or cheap to run.

Many of us learnt a lot from this rule generation method and thankfully
the basic code is in SA's SVN for ppl to glue their own rule generators.

I found something at
http://svn.apache.org/repos/asf/spamassassin/trunk/masses/rule-dev/sought



I run such a rule generator at $dayjob but it's far from being portable.
For me it was a very steep learning curve.. :)
(Thanks JM for helping out, back then)

Axb


Thanks for the info. So generating your own Sought rules can still be
effective. I guess the hardest part is probably the ham/spam corpus you
need for it.

Yes it is, very effective with 419/Phishing etc stuff which may slip by BLs

As it's only about body rules, you can collect ham from bulk mail.
You're only trying to keep certain commonly used phrases from landing in rules. Bulk mail qualifies really well for this.

Spam you can feed from traps or if you have the energy, sift thru spam folders. I use trap data only.
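
To make the ham/spam idea concrete, here is a rough sketch of the principle only. It is not the code from masses/rule-dev/sought, and the function names, the word n-gram approach and the thresholds are all my own assumptions: phrases that show up in several trap messages become rule candidates, and anything that also appears in the bulk-mail ham is dropped.

```python
from collections import Counter


def ngrams(text, n):
    """Return the set of lowercase word n-grams in one message body."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def candidate_phrases(spam_bodies, ham_bodies, n=4, min_spam_hits=2):
    """Hypothetical helper: phrases seen in at least min_spam_hits spam
    messages and in no ham message. Parameter defaults are illustrative."""
    ham_seen = set()
    for body in ham_bodies:
        ham_seen |= ngrams(body, n)

    counts = Counter()
    for body in spam_bodies:
        # Count each phrase once per message, skipping anything ham uses.
        counts.update(ngrams(body, n) - ham_seen)

    return sorted(p for p, c in counts.items() if c >= min_spam_hits)


spam = [
    "I am the widow of the late general",
    "Dear friend I am the widow of a minister",
    "click here to unsubscribe from this list now",
]
ham = ["click here to unsubscribe from this list"]
print(candidate_phrases(spam, ham))
```

With a larger ham corpus the common marketing and footer phrases fall out on their own, which is why bulk mail is good enough for this.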

Does your version run unattended or do you need to review your generated
rules?

Totally unattended, every 4 hours. When I think of it, I review a rule file to make sure no HTML code or unwanted characters are being used in rules, and add any offenders to "kill_bad_patterns".
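
That review step could be partly automated along these lines. This is a minimal sketch under my own assumptions: the regex, the ASCII-only test and the function names are hypothetical, and the real format of "kill_bad_patterns" is not shown in this thread, so the code only separates phrases into a keep list and a kill list.

```python
import re

# Assumption: angle brackets, HTML entities and href=/src= attributes are
# what "html code" means here. Adjust to taste.
HTML_RE = re.compile(r"[<>]|&[a-z#0-9]+;|\b(?:href|src)=", re.IGNORECASE)


def is_clean(phrase):
    """True if the phrase has no HTML residue and only printable ASCII."""
    if HTML_RE.search(phrase):
        return False
    # Assumption: "unwanted characters" = non-ASCII or non-printable.
    return all(ch.isprintable() and ord(ch) < 128 for ch in phrase)


def review(phrases):
    """Split generated phrases into (keep, kill) lists."""
    keep, kill = [], []
    for p in phrases:
        (keep if is_clean(p) else kill).append(p)
    return keep, kill


keep, kill = review([
    "i am the widow of",
    "<td width=",
    "click &nbsp; here",
    "caf\u00e9 offer",
])
print("keep:", keep)
print("kill:", kill)
```

The kill list is what you would then add, by hand or by script, to whatever bad-pattern file your generator reads.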

Basics you need are in /trunk/masses/rule-dev/
h2h

Axb

