On 01/20/16 14:28, John Hardin wrote:
On Wed, 20 Jan 2016, Marc Perkel wrote:

Here's another way to use my evolution filtering idea with SA.

Get rid of all the rule scores and just make a list of the rule names. From the rule names, generate every combination of up to 4 rule names as a fingerprint, and learn each fingerprint as either ham or spam. Sort of like this:

“A” “AB” “B” “C” “AC” “ABC” “BC” “D” “AD” “ABD” “BD” “CD” “ACD” “ABCD” “BCD” “E” “AE” “BE” “CE” “ACE” “BCE” “DE” “ADE” “ABDE” “BDE” “CDE” “ACDE” “ABCDE” “BCDE”
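
A minimal sketch of the fingerprint generation described above, in Python. The function name `fingerprints`, the `max_size` parameter, and the "+" separator are assumptions made for illustration; none of them come from SA. Rule names are sorted first so that the same set of rules always yields the same fingerprint regardless of the order in which they hit.

```python
from itertools import combinations


def fingerprints(rule_names, max_size=4):
    """Return every combination of 1..max_size rule names as a joined string.

    Hypothetical helper: the "+" separator is an assumption, chosen so that
    multi-character rule names stay distinguishable inside a fingerprint.
    """
    # Deduplicate and sort so "A+B" and "B+A" collapse to one fingerprint.
    names = sorted(set(rule_names))
    result = set()
    for size in range(1, min(max_size, len(names)) + 1):
        for combo in combinations(names, size):
            result.add("+".join(combo))
    return result


if __name__ == "__main__":
    for fp in sorted(fingerprints(["B", "A", "C"])):
        print(fp)
```

Note that the number of fingerprints grows quickly: a message hitting n rules produces C(n,1) + C(n,2) + C(n,3) + C(n,4) fingerprints, which is why capping the combination size matters.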

Then, when a new message comes in, you generate the same fingerprint combinations from the rule names it hit and apply my formula:

card(Test intersect Spam diff Ham) - card(Test intersect Ham diff Spam)

Positive result = spam
Negative result = ham
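
The formula can be sketched with plain Python sets. This is only an illustration: the function name `evolution_score` and the sample fingerprints are made up, and the learned Spam and Ham sets are assumed to be already available in memory. The grouping does not matter here, since (Test intersect Spam) diff Ham is the same set as Test intersect (Spam diff Ham).

```python
def evolution_score(test, spam, ham):
    """card(Test intersect Spam diff Ham) - card(Test intersect Ham diff Spam).

    All three arguments are sets of fingerprint strings.
    """
    spam_only = spam - ham  # fingerprints seen only in spam
    ham_only = ham - spam   # fingerprints seen only in ham
    return len(test & spam_only) - len(test & ham_only)


if __name__ == "__main__":
    # Made-up learned sets, for illustration only.
    spam = {"A", "A+B", "B", "C"}
    ham = {"B", "D", "B+D"}

    # "B" appears in both sets, so it counts for neither side.
    print(evolution_score({"A", "B", "A+B"}, spam, ham))  # 2  -> spam
    print(evolution_score({"B", "D", "B+D"}, spam, ham))  # -2 -> ham
```

A result of exactly zero is not covered by the rule above; how to treat it is left open here.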

Unfortunately, this also requires training. It would turn SA into a product that does not work out of the box.


Actually, it could ship with a corpus pretrained on the rules, at least to get people started. Someone (like me?) could also provide it as a service that SA talks to: SA would send the tokens to the service, and the service would return a score.

--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400
