RW-15 wrote:
> 
> On Fri, 12 Feb 2010 17:51:12 +0000
> RW <rwmailli...@googlemail.com> wrote:
> 
>> On Fri, 12 Feb 2010 09:17:54 -0800 (PST)
>> smfabac <smfa...@att.net> wrote:
>> 
>> > 
>> 
>> > Mark,
>> > 
>> > On UNIX any file is a mbox file if it contains mail messages in the
>> > form:
>> > 
>> > ^A^A^A^A
>> > mail headers
>> > mail body
>> > ^A^A^A^A
>> > ^A^A^A^A
>> > Next Message mail headers
>> > mail body
>> > ^A^A^A^A
>> 
>> I don't know what that is, but it's not a standard mbox format.
>> 
>> In mbox format the emails all start with a blank line and a From.
> 
> 
> It appears to be mmdf format
> 
> http://www.washington.edu/imap/documentation/formats.txt.html
> 
> 
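To tell the two formats discussed above apart, the first non-blank line of a folder file is usually enough: an mbox file starts with a "From " line, while a file in the form quoted above starts with a line of four Control-A characters. The following is only a rough sketch of that check. The function name sniff_mailbox_format is made up for illustration and is not part of SpamAssassin or any mail tool.

```python
import sys

# Four Control-A bytes on a line of their own: the delimiter quoted above.
MMDF_DELIM = b"\x01\x01\x01\x01"


def sniff_mailbox_format(path):
    """Guess the folder format from the first non-blank line (hypothetical helper)."""
    with open(path, "rb") as fh:
        for line in fh:
            stripped = line.rstrip(b"\r\n")
            if not stripped:
                continue  # skip leading blank lines
            if stripped == MMDF_DELIM:
                return "mmdf"
            if stripped.startswith(b"From "):
                return "mbox"
            return "unknown"
    return "empty"


if __name__ == "__main__":
    # Example: python sniff.py isspam not-spam
    for name in sys.argv[1:]:
        print(name, sniff_mailbox_format(name))
```

Running it against both isspam and not-spam would show whether the two folders are actually stored in the same format.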
OK, now that we're all on the same page: how do I find out why sa-learn is
not processing the legal not-spam file?

To recap, "sa-learn --spam --mbox isspam" works but "sa-learn --ham --mbox
not-spam" does not.

The sa-learn --dump magic output shows that messages have been added by the
sa-learn command:

$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12551          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     143948          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$ sa-learn --spam --mbox isspam
Learned tokens from 1 message(s) (1 message(s) examined)
$
$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12552          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     144608          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$

As you can see, nspam has been incremented by 1 (12551 to 12552).

$ sa-learn --ham --mbox not-spam
Learned tokens from 0 message(s) (0 message(s) examined)
$

The mail client shows one message in the not-spam folder:

Read Create Save Delete Undelete Print Folder Options Quit
Set mail options and preferences
Folder: not-spam                      Saturday February 13, 2010 2:34
---------------------------------- [1] Message --------------------------------
1 gerb...@zenez.co 11 Feb 10 6404 Quarterly ASCII posting of SCO Uni

Is there a message size limit for sa-learn? The message in not-spam is plain
ASCII, no HTML.
$ wc -l not-spam
6408 not-spam   <-- sa-learn --ham failed on not-spam folder with one message
$
$ wc -l isspam
1039 isspam     <-- sa-learn --spam worked on isspam folder with one message
$

-- 
View this message in context: http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27573012.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
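
If the not-spam folder does turn out to be in the ^A^A^A^A-delimited form quoted at the top of this message, one possible workaround is to rewrite it as a plain mbox file before running sa-learn on it. The sketch below is only an illustration under that assumption, not a known fix for this problem. The helper names split_mmdf and to_mbox, the placeholder sender MAILER-DAEMON, and the output file name not-spam.mbox are all made up for the example.

```python
import sys
import time

# Four Control-A bytes on a line of their own bracket each message.
MMDF_DELIM = b"\x01\x01\x01\x01"


def split_mmdf(data):
    """Return the raw messages found between ^A^A^A^A delimiter lines."""
    messages, current, inside = [], [], False
    for line in data.splitlines(keepends=True):
        if line.rstrip(b"\r\n") == MMDF_DELIM:
            if inside:
                messages.append(b"".join(current))
                current = []
            inside = not inside
        elif inside:
            current.append(line)
    if inside and current:
        messages.append(b"".join(current))  # tolerate a missing final delimiter
    return messages


def to_mbox(messages, sender="MAILER-DAEMON"):
    """Join raw messages into mbox text; sender is a placeholder (assumption)."""
    stamp = time.asctime(time.gmtime())
    synthetic = ("From %s %s\n" % (sender, stamp)).encode("ascii")
    out = []
    for raw in messages:
        lines = raw.splitlines(keepends=True)
        if lines and lines[0].startswith(b"From "):
            # The message already carries an envelope line; keep it.
            out.append(lines.pop(0))
        else:
            out.append(synthetic)
        for line in lines:
            # Quote body lines that would otherwise look like a new message.
            out.append(b">" + line if line.startswith(b"From ") else line)
        if lines and not lines[-1].endswith(b"\n"):
            out.append(b"\n")
        out.append(b"\n")  # blank line separating messages
    return b"".join(out)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python mmdf2mbox.py not-spam not-spam.mbox
    with open(sys.argv[1], "rb") as src:
        converted = to_mbox(split_mmdf(src.read()))
    with open(sys.argv[2], "wb") as dst:
        dst.write(converted)
```

The converted copy could then be tried with "sa-learn --ham --mbox not-spam.mbox". If that still reports 0 messages examined, the folder format is probably not the cause.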