On 03.12.2015 at 12:41, Jeroen de Neef wrote:
I'd like to train my Bayes filter correctly, especially since I don't get
a lot of email. Thanks to Reindl's list, I will ignore those headers from now on.
But I don't want it to learn that the /*****spam*****/ in the subject
means that a message is spam or ham. Is there a way I can remove it before
passing the message to the Bayesian filter? Perhaps an extra line in the config
or a bash script?

Just add a replace to the PHP script I posted, before the point where it compares the new content against the old one to decide whether the file needs to be rewritten.
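
A minimal sketch of such a replace, written in Python rather than in the PHP script mentioned above (which is not shown here). The function name and the exact tag text are assumptions, so adjust the pattern to whatever your setup prepends to the Subject. It assumes LF line endings and one message per string.

```python
import re

# Assumption: the tag is five asterisks, the word "spam", five asterisks,
# placed directly after "Subject:". Matching is case-insensitive.
SUBJECT_TAG = re.compile(
    r"^(Subject:[ \t]*)(?:\*{5}spam\*{5}[ \t]*)+",
    re.IGNORECASE | re.MULTILINE,
)


def strip_subject_tag(raw: str) -> str:
    """Remove the tag from the Subject header; the body is left untouched."""
    # Headers end at the first empty line.
    head, sep, body = raw.partition("\n\n")
    return SUBJECT_TAG.sub(r"\1", head) + sep + body
```

Run over each sample before it is handed to the Bayes trainer, this keeps the tag out of the learned tokens while leaving the rest of the subject intact.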

For such cleanup and anonymization I use separate scripts to keep the code clean. One of them also reads the Postfix configuration and replaces our own domains and email addresses with "m...@example.com".
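
A rough sketch of the address replacement, in Python. It does not read the Postfix configuration; the list of own domains and the replacement address are passed in by the caller, and the function name is hypothetical. It only rewrites full addresses, not bare domain names.

```python
import re


def anonymize_addresses(raw: str, own_domains, placeholder: str) -> str:
    """Replace every address at one of own_domains with placeholder."""
    if not own_domains:
        return raw
    alternatives = "|".join(re.escape(d) for d in own_domains)
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:" + alternatives + r")"
        # Do not match when the domain continues, e.g. corp.example.evil.test
        r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    # A function is used so that backslashes in placeholder stay literal.
    return pattern.sub(lambda m: placeholder, raw)
```

For example, `anonymize_addresses(text, ["corp.example"], "anon@example.com")` rewrites `jane@corp.example` and leaves addresses at other domains alone; both names in that call are made up for illustration.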

"I will ignore those headers from now on" - the ignore configuration is not enough, hence the formail script to strip the headers completly from the samples

The Received header is a special case: if the samples don't have any Received header, you get *completely* different Bayes results than with an always identical one. Hence I strip them all and, as the last step, add one generic Received header at the top of the file.

That also leads to a dramatically reduced number of tokens, because in the end there is only one Received header, always with the same date, time, and host.
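
The Received handling described above could look roughly like this in Python. It is a sketch, not the script used here: the function name is hypothetical, and the host, address, and date in the generic header are placeholders. It assumes LF line endings and a single message per string without an mbox "From " line.

```python
# Hypothetical fixed header; every value in it is a placeholder.
GENERIC_RECEIVED = (
    "Received: from localhost (localhost [127.0.0.1])\n"
    "\tby localhost with ESMTP; Thu, 01 Jan 2015 00:00:00 +0000"
)


def normalize_received(raw: str, generic: str = GENERIC_RECEIVED) -> str:
    """Drop all Received headers, then put one generic header on top."""
    head, sep, body = raw.partition("\n\n")
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Folded continuation of the previous header.
            if not skipping:
                kept.append(line)
            continue
        skipping = line.lower().startswith("received:")
        if not skipping:
            kept.append(line)
    return "\n".join([generic] + kept) + sep + body
```

Because every sample ends up with the identical header, the Received line contributes the same few tokens to each message instead of a new set per hop.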
