Hello all,

Following a hard drive crash I am rebuilding my small home server on a Fedora 17 platform.

One of the casualties of the crash was my spam corpus. I had a (very old) backup which happened to include a previous spam corpus, so I fed that to sa-learn. The result: every spam that arrives now hits BAYES_00.

I don't have many "fresh" spams to train with. I do not run an SMTP server; I simply collect mail for my family and myself from my ISP and other sources using fetchmail. My ISP seems to filter most of the really bad stuff, so I get only a trickle of spam (about one per day - if that), but even those hit BAYES_00, sometimes despite being identical to a previous false negative (FN) that had already been learned with sa-learn.

Here is my --dump magic output:

================================8<=========================================
$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       4551          0  non-token data: nspam
0.000          0       3054          0  non-token data: nham
0.000          0     198095          0  non-token data: ntokens
0.000          0 1346143801          0  non-token data: oldest atime
0.000          0 1349506984          0  non-token data: newest atime
0.000          0 1349493620          0  non-token data: last journal sync atime
0.000          0 1349476411          0  non-token data: last expiry atime
0.000          0    1382400          0  non-token data: last expire atime delta
0.000          0     171403          0  non-token data: last expire reduction count
================================8<=========================================

What - if anything - can I do to improve Bayes performance?

Thanks in advance

Mark
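P.S. In case it matters, this is roughly how the training was done; the mailbox and message paths below are illustrative, not my exact ones:

================================8<=========================================
# initial training from the old backup corpus (mbox format; paths illustrative)
$ sa-learn --spam --mbox /backup/mail/spam-corpus.mbox
$ sa-learn --ham --mbox /backup/mail/inbox.mbox

# re-learning an individual false negative (single message file)
$ sa-learn --spam ~/Maildir/.Junk/cur/missed-spam.eml

# flush the journal so the learned tokens reach the database immediately
$ sa-learn --sync
================================8<=========================================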