Kevin A. McGrail created COMDEV-260:
---------------------------------------
Summary: SpamAssassin Bayes Token ID
Key: COMDEV-260
URL: https://issues.apache.org/jira/browse/COMDEV-260
Project: Community Development
Issue Type: Project
Reporter: Kevin A. McGrail
From DFS idea used with permission:
We tokenize inbound messages and store the tokens on the server. In each
message, we add links for doing training. When you click on a training link,
the system trains the message based on the tokens stored on the server. In that
way, you are training using exactly the tokens that the Bayes code saw.
For SA, the key point is a framework that stores the Bayesian tokens from an
email before the email is delivered, so that a later "this is spam" / "this is
ham" mechanism can use that information without needing the entire email.
Adding a header that carries the message id under which the tokens are stored
would make a "train as spam" / "train as ham" framework easier to build.
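As a rough illustration of the store-and-tag step described above, here is a
minimal Python sketch. Everything in it is an assumption made for the example:
the header name X-Bayes-Token-ID, the in-memory token_store dict, and the toy
simple_tokenize function are hypothetical and are not SpamAssassin's actual
tokenizer, storage, or API.

```python
import re
import uuid
from email import message_from_string
from email.message import Message

# Hypothetical header name; the proposal does not specify one.
TOKEN_ID_HEADER = "X-Bayes-Token-ID"

# Stand-in for server-side storage: token ID -> set of tokens.
token_store: dict[str, set[str]] = {}


def simple_tokenize(text: str) -> set[str]:
    """Toy tokenizer; SpamAssassin's real Bayes tokenizer is more elaborate."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def store_tokens_and_tag(raw_message: str) -> Message:
    """Tokenize a message before delivery, store the tokens, add the ID header."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()
    # Only simple single-part text bodies are handled in this sketch.
    text = f"{msg.get('Subject', '')} {body if isinstance(body, str) else ''}"
    token_id = uuid.uuid4().hex
    token_store[token_id] = simple_tokenize(text)
    msg[TOKEN_ID_HEADER] = token_id
    return msg
```

The delivered message then carries only the ID; the tokens themselves stay on
the server, keyed by that ID.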
The issues you are pointing to have more to do with the implementation of the
"this is spam" / "this is ham" mechanism.
Storing just the tokens takes less space and mitigates privacy and legal
concerns.
sa-learn would then be extended to use the message id and learn as spam/ham
instead of feeding it the entire message.
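To make the idea of learning from an ID concrete, here is a minimal Python
sketch. It is not sa-learn and does not use any SpamAssassin interface: the
learn_by_id function, the in-memory token_store, and the per-class counters are
all hypothetical names chosen for the example.

```python
from collections import Counter

# Stand-in for tokens saved at delivery time: token ID -> tokens.
token_store: dict[str, set[str]] = {"abc123": {"cheap", "pills", "buy", "now"}}

# Toy Bayes database: how often each token was seen in each class.
spam_counts: Counter = Counter()
ham_counts: Counter = Counter()

# Remember what each ID was learned as, so a correction can undo it.
learned: dict[str, str] = {}


def learn_by_id(token_id: str, is_spam: bool) -> bool:
    """Train on the stored tokens for token_id; no message body is needed."""
    tokens = token_store.get(token_id)
    if tokens is None:
        return False  # tokens expired or were never stored
    label = "spam" if is_spam else "ham"
    previous = learned.get(token_id)
    if previous == label:
        return True  # already learned this way; nothing to do
    if previous is not None:
        # The user changed their mind: remove the earlier training first.
        (spam_counts if previous == "spam" else ham_counts).subtract(tokens)
    (spam_counts if is_spam else ham_counts).update(tokens)
    learned[token_id] = label
    return True
```

In this sketch a training link or command only has to pass the ID and the
chosen class; undoing an earlier, opposite classification is one possible
design choice, not something the proposal specifies.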
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)