On Sat, 20 Apr 2013, Joe Acquisto-j4 wrote:

> In order to send the samples, the user will forward the messages, as an attachment. Each is an individual message to either ham or spam, with the (hopefully) correct attachment.

Are you extracting the attachments off those messages to feed to sa-learn? Or are you feeding in the entire forwarded message including the attachment?

If the latter, you're training stuff you shouldn't be (the headers of the submission to the training folders) and you'll see every user's submission of the same multi-recipient spam as being learned separately.
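For the first approach, here is a rough sketch of pulling the original messages back out of a forwarded-as-attachment submission so that only they get trained. It assumes the mail client attaches the original as a message/rfc822 part; the function name extract_forwarded is made up for illustration, and this is not something SpamAssassin ships.

```python
import email
from email import policy


def extract_forwarded(raw_bytes):
    """Return the raw bytes of each message/rfc822 attachment in a submission.

    Hypothetical helper: the wrapper's own headers and body (the user's
    forward to the training address) are deliberately discarded.
    """
    wrapper = email.message_from_bytes(raw_bytes, policy=policy.default)
    originals = []

    def visit(part):
        if part.get_content_type() == "message/rfc822":
            # A parsed message/rfc822 part holds the attached message as
            # its single sub-payload. Do not descend further: anything
            # attached inside the original belongs to the original.
            if part.is_multipart():
                originals.append(part.get_payload(0).as_bytes())
        elif part.is_multipart():
            for sub in part.get_payload():
                visit(sub)

    visit(wrapper)
    return originals
```

Each returned item could then be written to its own file in a spam or ham directory, and that directory handed to sa-learn --spam or sa-learn --ham, so the submitter's forwarding headers never reach the Bayes database.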

This is one reason it's better, if possible, to have global training folders that users can just move/copy messages into. If training submissions pass through your mail system again, things get complicated.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Christian martyrs don't explode.                         -- Marisol
-----------------------------------------------------------------------
 3 days until Max Planck's 155th birthday
