These "concepts" are sometimes referred to as "shingles"; an example of their use is in https://www.google.com/patents/US20140365585
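
To make the connection concrete, here is a minimal sketch of how word shingles could be mapped to concept tokens of the kind listed in the quoted message below. It assumes "shingle" in the usual sense of a contiguous run of n words. The `CONCEPTS` table, the function names, and the sample message are all hypothetical, and I am not claiming this is how the patent or Marc's filter actually does it.

```python
import re

# Hypothetical phrase-to-concept table, for illustration only.
# Several phrasings can map to the same concept token.
CONCEPTS = {
    ("western", "union"): "western_union",
    ("bank", "account"): "bank_account",
    ("transfer", "money"): "transfer_money",
    ("wire", "funds"): "transfer_money",
    ("trust", "me"): "trust_me",
}


def shingles(text, n=2):
    """Return the contiguous n-word sequences (word shingles) of text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def concepts(text, n=2):
    """Return the set of concept tokens whose phrases occur as shingles."""
    return {CONCEPTS[s] for s in shingles(text, n) if s in CONCEPTS}


msg = "Please trust me and transfer money to my bank account via Western Union."
print(sorted(concepts(msg)))
# ['bank_account', 'transfer_money', 'trust_me', 'western_union']
```

The resulting tokens could then be combined into rules or fed to a Bayes classifier as extra features, as the quoted message suggests. A fixed n of 2 is only a simplification; a real table would need phrases of varying length.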
Marc Perkel wrote:
> This is a spam filtering trick I'm using but it's not SA, but could be
> easily adapted to SA or other filtering systems. I thought I'd share
> this for others to use or improve upon.
>
> Rather than just scan for regex strings it's useful to have a way to
> tell what things the message is talking about and reduce those to a
> single token that represents a concept. Then the concepts can be
> combined to produce rules or fed into Bayes for automatic scoring.
>
> http://wiki.junkemailfilter.com/index.php/Concept_Parsing_Spam_Filter
>
> Here's an example of concepts:
>
> dear stranger
> i need your information
> offers lots of money
> dying of something
> worships god
> bank account
> transfer money
> reply to me
> trust me
> africa
> united nations
> western union
>
>
> Let me know if you find it useful.
>
> --
> Marc Perkel - Sales/Support
> supp...@junkemailfilter.com
> http://www.junkemailfilter.com
> Junk Email Filter dot com
> 415-992-3400
>
>
> _______________________________________________
> mailop mailing list
> mailop@mailop.org
> https://chilli.nosignal.org/cgi-bin/mailman/listinfo/mailop

--
Vladimir Dubrovin
@Mail.Ru