>From: Paul Stead <paul.st...@zeninternet.co.uk>
>Sent: Tuesday, May 24, 2016 9:55 AM
>To: users@spamassassin.apache.org
>Subject: SA Concepts - plugin for email semantics

>Hi guys,

>Based upon some information from others on the list I have put together
>a plugin for SA which canonicalises an email into its basic "concepts".
>Concepts are converted to tags, which Bayes can use as tokens to further
>help identify spammy/hammy characteristics.

>Here are some examples of tags from some emails today -

>---8<---
>X-SA-Concepts: experience regards money optout time-ref dear great home
>request member enjoy woman-adj important online click all-rights
>email-adr please price best hot-adj
>X-SA-Concepts: experience contact optout winner time-ref survey dear
>home privacy prize store thankyou important click gift chance please
>X-SA-Concepts: google law search-eng optout amazing order facebook
>goodtime privacy lotsofmoney request enjoy details service partner
>linkedin twitter trust contact time-ref great online click shop
>email-adr please customer newsletter news
>X-SA-Concepts: photos view-online money contact optout time-ref cost
>reply2me service details online click please
>X-SA-Concepts: friend hotwords trust experience regards contact time-ref
>medical woman drugs consultant pill mailto woman-adj secret health earn
>email-adr please security hot-adj day-of-week
>X-SA-Concepts: https mailto re euros regards money youtube invoice
>email-adr facebook best hair
>---8<---

>This plugin essentially adds an extra layer between the raw input
>characteristics and recognition types - allowing clustering of different
>characteristics to a more generic type - in effect giving Bayes more of
>a two-layer neural network approach.

>When combined with Bayes learning, these email semantics (or Concepts)
>are weighed together with the other characteristics of that email and
>compared against the email that came before it.

>https://github.com/fmbla/spamassassin-concepts

>I'd be really interested to hear your feedback/thoughts on this system
>and its approach.

>Paul

Good idea. I would like to test this out, so I put it on my CentOS 6 servers
(perl v5.10.1) and got this:

May 24 10:59:51.850 [30158] warn: plugin: failed to parse plugin 
/etc/mail/spamassassin/Concepts.pm: Type of arg 1 to push must be array (not 
private variable) at /etc/mail/spamassassin/Concepts.pm line 84, near "$headl;"
May 24 10:59:51.850 [30158] warn: Type of arg 1 to push must be array (not 
private variable) at /etc/mail/spamassassin/Concepts.pm line 88, near ");"
May 24 10:59:51.850 [30158] warn: Type of arg 1 to keys must be hash (not hash 
element) at /etc/mail/spamassassin/Concepts.pm line 93, near "}) "
May 24 10:59:51.850 [30158] warn: Type of arg 1 to keys must be hash (not 
private variable) at /etc/mail/spamassassin/Concepts.pm line 104, near 
"$matched_concepts) "
May 24 10:59:51.850 [30158] warn: Type of arg 1 to push must be array (not hash 
element) at /etc/mail/spamassassin/Concepts.pm line 168, near "$re if"
May 24 10:59:51.850 [30158] warn: Type of arg 1 to keys must be hash (not 
private variable) at /etc/mail/spamassassin/Concepts.pm line 174, near 
"$concepts;"
May 24 10:59:51.850 [30158] warn: Compilation failed in require at 
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/PluginHandler.pm line 109.
May 24 10:59:52.472 [30158] warn: config: failed to parse line, skipping, in 
"/etc/mail/spamassassin/41_concepts.cf": concepts_dir 
/etc/mail/spamassassin/concepts
May 24 10:59:52.472 [30158] warn: Unrecognized escape \l passed through in 
regex; marked by <-- HERE in m/\l <-- HERE otsofmoney\b/ at 
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/Conf/Parser.pm line 1388.
May 24 10:59:54.646 [30158] warn: lint: 1 issues detected, please rerun with 
debug enabled for more information
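
For what it's worth, the "Type of arg 1 to push/keys must be array/hash"
errors look like what perl 5.10 reports when push and keys are called
directly on a reference. That shortcut was only added in Perl 5.14 (as an
experimental feature, since removed), so older perls need an explicit
dereference. I have not checked the actual lines in Concepts.pm, so the
following is only a sketch of the likely change: $headl and
$matched_concepts are names taken from the log above, while %state, the
"rules" key and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the plugin's data. Only the variable names
# $headl and $matched_concepts come from the log; the contents are invented.
my $headl            = [];                            # array reference
my $matched_concepts = { money => 1, optout => 1 };   # hash reference
my %state            = ( rules => {} );               # hash with nested refs

# perl 5.10 rejects `push $headl, ...`; dereference the array ref explicitly.
push @{$headl}, 'X-SA-Concepts:';

# Same for a hash element that holds (or autovivifies to) an array ref.
push @{ $state{rules}{money} }, qr/\bmoney\b/i;

# perl 5.10 rejects `keys $matched_concepts`; dereference the hash ref.
my @tags = sort keys %{$matched_concepts};

print join( ' ', @{$headl}, @tags ), "\n";
```

The explicit @{...} and %{...} forms also work on current perls, so a
change along these lines should not break newer installs.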

Thanks for sharing your code and the time you put into this,
Dave
