how about just a -D output?
Theo,
Just to make sure I wasn't crazy, I backed up my old tokens and seen files, and re-ran 2.60rc2's sa-learn -D on all my 16,000+ spams. The end result:
Learned from 16263 messages (16397 messages examined).
debug: bayes: 4026 untie-ing
debug: bayes: 4026 untie-ing db_toks
debug: bayes: 4026 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
unlock: 4026 unlink failed: /home/ben/.spamassassin/bayes.lock
debug: unlock: 4026 unlink /home/ben/.spamassassin/bayes.lock
(I have no idea why it failed to unlock..)
Then, immediately after:
[EMAIL PROTECTED]:~$ sa-learn --dump magic
0.000 0 2 0 non-token data: bayes db version
0.000 0 4 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 461 0 non-token data: ntokens
0.000 0 1062010981 0 non-token data: oldest atime
0.000 0 1062017323 0 non-token data: newest atime
0.000 0 1062010981 0 non-token data: last journal sync atime
0.000 0 1062010981 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
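To put the mismatch in numbers, here is a rough Python sketch (a throwaway helper, not part of SpamAssassin) that pulls the counters out of the dump above and compares nspam with the "Learned from" count. It assumes the value of each "non-token data" line sits in the third column, as it does in the output shown here; the names parse_magic, MAGIC_LINE, and SAMPLE are made up for illustration.

```python
import re

# Assumption: each magic line looks like
#   <prob> <n> <value> <n> non-token data: <label>
# with the interesting number in the third column, as in the dump above.
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$"
)

def parse_magic(text):
    """Map each 'non-token data' label to the integer in the third column."""
    values = {}
    for line in text.splitlines():
        m = MAGIC_LINE.match(line)
        if m:
            values[m.group(2)] = int(m.group(1))
    return values

# First four lines of the dump, copied from the output above.
SAMPLE = """\
0.000 0 2 0 non-token data: bayes db version
0.000 0 4 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 461 0 non-token data: ntokens
"""

magic = parse_magic(SAMPLE)
learned = 16263  # from the "Learned from 16263 messages" line

print(f"nspam={magic['nspam']} ntokens={magic['ntokens']} learned={learned}")
if magic["nspam"] < learned:
    print("mismatch: the database records fewer spams than sa-learn reported")
```

Feeding it the real output of sa-learn --dump magic instead of SAMPLE should work the same way, as long as the column layout matches.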
Do I just have too many spams? Or is auto-learning somehow messing up my final result?
Interestingly, the resulting bayes_toks file is tiny compared to the old one:
-rw------- 1 ben ben  1318912 Aug 27 13:51 bayes_seen
-rw------- 1 ben ben  1331200 Aug 27 10:52 bayes_seen.old
-rw------- 1 ben ben    65536 Aug 27 13:47 bayes_toks
-rw------- 1 ben ben 11468800 Aug 27 10:52 bayes_toks.old
Ben