On Sat, 6 Oct 2012, Arthur Dent wrote:

> Following a hard drive crash I am rebuilding my small home server on a
> Fedora17 platform.
>
> One of the casualties of the HD crash was my spam corpus. I had a (very
> old) backup which happened to include a previous spam corpus so I used
> that to sa-learn.
>
> All my messages hit BAYES_00.

Well, you're probably going to have to re-train from scratch.

Review every message in your training corpora to ensure they are properly classified.

Add a bunch of new ham and, if you have any, new spam.

Very old spam (say, more than 5 years old) is unlikely to be much use and should probably be omitted, unless your spam corpus is very small.
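
If you want to weed out the old messages automatically rather than by eye, something along these lines might do as a starting point. It is only a rough sketch: the function name is_too_old and the 5-year default are my own choices for illustration, and it assumes each message is available as raw text with a usable Date header. Messages with a missing or unparseable date are kept, so you can review them by hand.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def is_too_old(raw_message, now, max_age_years=5):
    """Return True if the message's Date header is older than the cutoff.

    Messages with a missing or unparseable Date header return False, so
    they are kept for manual review rather than silently dropped.
    """
    date_header = message_from_string(raw_message).get("Date")
    if date_header is None:
        return False
    try:
        sent = parsedate_to_datetime(str(date_header))
    except (TypeError, ValueError):
        return False
    if sent.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return now - sent > timedelta(days=365 * max_age_years)
```

You would call it once per file in the spam corpus, passing datetime.now(timezone.utc) as now, and move aside anything it flags instead of deleting it outright.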

Turn off autolearn. I'm in a similar situation and hand-training on the rare misses works great for me.
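
For reference, autolearn is controlled by the bayes_auto_learn option in the SpamAssassin configuration. Which file you put it in (a site-wide local.cf or a per-user user_prefs) depends on your setup, so treat the placement as an assumption:

```
# Disable Bayes auto-learning; train by hand with sa-learn instead.
bayes_auto_learn 0
```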

Also, given your low volume, I would recommend quarantining all spam rather than setting a discard threshold score above which spam is thrown out unseen. Any spam that does get delivered to your inbox can be reviewed and added to your spam training corpus.

Zap your Bayes database, re-train and see how it goes.
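
As a rough sketch of that last step, the sequence is: clear the database, learn ham, learn spam, then check the counts. The helper names and the dry-run default below are my own invention for illustration. The sa-learn options used (--clear, --ham, --spam, --dump magic) are the standard ones, but the corpus paths are whatever you use locally. It should be run as the user whose Bayes database SpamAssassin actually consults.

```python
import subprocess


def retrain_commands(ham_dir, spam_dir):
    """Build the sa-learn invocations for a from-scratch Bayes retrain."""
    return [
        ["sa-learn", "--clear"],           # wipe the existing Bayes database
        ["sa-learn", "--ham", ham_dir],    # learn the reviewed ham corpus
        ["sa-learn", "--spam", spam_dir],  # learn the reviewed spam corpus
        ["sa-learn", "--dump", "magic"],   # show token and message counts
    ]


def retrain(ham_dir, spam_dir, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for argv in retrain_commands(ham_dir, spam_dir):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling retrain("ham", "spam") only prints the commands. Pass dry_run=False once you are happy with the corpora, since --clear cannot be undone.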

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  ...wind turbines are not meant to actually be an efficient way to
  supply the power grid, rather they're prayer wheels for New Age
  iBuddhists, their whirring blades drawing white guilt from the
  atmosphere and pumping it safely underground.                -- Tam
-----------------------------------------------------------------------
 Tomorrow: the first private ISS resupply mission (SpaceX/Dragon)
