Thanks for this. It'll be useful to show the next person who tries to convince 
me software patents are a good idea. 

> On 22 Jan 2016, at 7:48 AM, Marc Perkel <supp...@junkemailfilter.com> wrote:
> 
> Just to follow up on this: I'm in the process of improving the filter, but I 
> have filed my provisional patent, so I'm going to give you an overview of how 
> it works.
> 
> Most spam filters work by matching things. Matching ham and spam. Matching 
> rules. The important point here is that this new system, which I'm calling 
> the Evolution filter, is about NOT matching.
> 
> Suppose I sent you an email with the subject line "Let's get dinner". You can 
> tell instantly this is good email. How? Because spammers never say "Let's get 
> dinner".
> 
> There are millions of phrases used in good email every day that are never 
> used in spam. And there are millions of phrases used every day in spam that 
> are never used in good email. So if I get an email that matches phrases used 
> in good email and never used in spam, it's a good message. And if the 
> message contains words and phrases used in spam and never used in ham, it's 
> spam.
> 
> So how do I get a list of all phrases never used in ham or never used in 
> spam? I make a list of all words and phrases used in ham and a list of all 
> those used in spam, and test whether a given phrase is NOT in one of the 
> lists. To illustrate my point,
> 
> Here is a list of 5505874 words and phrases used in the subject line of HAM 
> and never seen in the subject line of SPAM
> 
> http://www.junkemailfilter.com/data/subject-ham.txt
> 
> Here is a list of 3494938 words and phrases used in the subject line of SPAM 
> and never seen in the subject line of HAM
> 
> http://www.junkemailfilter.com/data/subject-spam.txt
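
Two exclusive lists like these could be built roughly as follows. This is only a minimal sketch, not the code behind the lists above: the post does not say how "words and phrases" are extracted, so the lowercasing, whitespace splitting, the 3-word limit, and the names `phrases` and `exclusive_lists` are all assumptions made for illustration.

```python
def phrases(subject, max_words=3):
    """Return the set of word n-grams (1..max_words words) in a subject line.

    Assumption: phrases are runs of consecutive lowercased,
    whitespace-separated words. The original post does not specify this.
    """
    words = subject.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    }


def exclusive_lists(ham_subjects, spam_subjects):
    """Return (ham_only, spam_only): phrases seen in one corpus and never in the other."""
    ham = set().union(*(phrases(s) for s in ham_subjects))
    spam = set().union(*(phrases(s) for s in spam_subjects))
    return ham - spam, spam - ham


ham_only, spam_only = exclusive_lists(["Let's get dinner"], ["Get rich now"])
# "get" occurs in both corpora, so it ends up in neither list.
```

With real corpora the inputs would be the subject lines of known-good and known-bad mail; the two one-line lists here are made up just to show the shape of the result.
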
> 
> The thing about not matching is that matching involves finite sets. Not 
> matching involves infinite sets. And infinite sets are always bigger than 
> finite sets.
> 
> Here is a link to my patent.
> 
> http://www.junkemailfilter.com/patent/
> 
> What I intend to do is to give it away to the little guys and charge the big 
> guys a small license fee. The process of implementing this is fairly easy. 
> I'm hoping to encourage the open source world to take this idea and do it 
> right. My code is cobbled together and uses 4 different languages. But the 
> concept is enough to get you going.
> 
> One thing you will need to implement this is Redis. Redis is extremely fast 
> at set comparisons, and set comparisons are how this works. It can be 
> expressed as one formula.
> 
> score = card((SpamCorpus intersect TestMessage) diff HamCorpus)
>       - card((HamCorpus intersect TestMessage) diff SpamCorpus)
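
For anyone who wants to try the formula before setting up Redis, here is a minimal sketch using plain Python sets. It reads the formula left to right (intersect first, then diff), and the function name `score` and the tiny example corpora are made up for illustration. Since the spam term is the one being added, a positive score leans spam and a negative one leans ham; the post gives no threshold, so none is assumed here.

```python
def score(message_phrases, ham_corpus, spam_corpus):
    """score = card((Spam & Msg) - Ham) - card((Ham & Msg) - Spam)."""
    spam_hits = (spam_corpus & message_phrases) - ham_corpus
    ham_hits = (ham_corpus & message_phrases) - spam_corpus
    return len(spam_hits) - len(ham_hits)


# Made-up example corpora; real ones would hold millions of phrases.
ham_corpus = {"let's get dinner", "get", "dinner"}
spam_corpus = {"get", "rich", "get rich"}

dinner = score({"let's get dinner", "get", "dinner"}, ham_corpus, spam_corpus)  # -2
rich = score({"get", "rich", "get rich"}, ham_corpus, spam_corpus)              # 2
```

In Redis the same steps would presumably map onto its set commands (SINTER, SDIFF, SCARD, or the *STORE variants), but that mapping is my guess and not taken from the post.
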
> 
> I'm seeing an accuracy level that is so close to 100% it's scary. It is 
> especially good at actively identifying good email to prevent false positives.
> 
> I will post more soon as it all comes together.
> 
> 
> 
> 
> _______________________________________________
> mailop mailing list
> mailop@mailop.org
> https://chilli.nosignal.org/cgi-bin/mailman/listinfo/mailop
