RW-15 wrote:
> 
> On Fri, 12 Feb 2010 17:51:12 +0000
> RW <rwmailli...@googlemail.com> wrote:
> 
>> On Fri, 12 Feb 2010 09:17:54 -0800 (PST)
>> smfabac <smfa...@att.net> wrote:
>> 
>> > 
>> 
>> > Mark, 
>> > 
>> > On UNIX any file is a mbox file if it contains mail messages in the
>> > form:
>> > 
>> > ^A^A^A^A
>> > mail headers
>> > mail body
>> > ^A^A^A^A
>> > ^A^A^A^A
>> > Next Message mail headers
>> > mail body
>> > ^A^A^A^A
>> 
>> I don't know what that is, but it's not a standard mbox format.
>> 
>> In mbox format the emails all start with a blank line and a From.
> 
> 
> It appears to be mmdf format
> 
> http://www.washington.edu/imap/documentation/formats.txt.html
> 
> 

OK,

Now that we're all on the same page, how do I find out why sa-learn
is not processing the legal not-spam file?  To recap, "sa-learn --spam
--mbox isspam" works but "sa-learn --ham --mbox not-spam" does
not.

The output of "sa-learn --dump magic" shows that messages have been
added by the sa-learn command:

$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12551          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     143948          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count

$ sa-learn --spam --mbox isspam
Learned tokens from 1 message(s) (1 message(s) examined)
$

$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12552          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     144608          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$ 

As you can see, nspam has incremented by 1 (12551 to 12552) and nham is
unchanged.
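
For anyone who wants to compare two dumps without reading the columns by
eye, here is a minimal Python sketch. It assumes the layout shown in the
output above, where the third column holds the value and the label follows
"non-token data:". The names parse_magic and learned_delta are made up for
this illustration and are not part of SpamAssassin.

```python
# Sample lines copied from the two dumps in this message.
BEFORE = """\
0.000          0      12551          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     143948          0  non-token data: ntokens
"""

AFTER = """\
0.000          0      12552          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     144608          0  non-token data: ntokens
"""


def parse_magic(dump_text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue  # skip prompts, blank lines and wrapped fragments
        fields, label = line.split("non-token data:", 1)
        columns = fields.split()
        if len(columns) >= 3:
            values[label.strip()] = int(columns[2])
    return values


def learned_delta(before_text, after_text):
    """Return how much nspam, nham and ntokens changed between two dumps."""
    before = parse_magic(before_text)
    after = parse_magic(after_text)
    return {key: after[key] - before[key] for key in ("nspam", "nham", "ntokens")}


print(learned_delta(BEFORE, AFTER))
# prints {'nspam': 1, 'nham': 0, 'ntokens': 660}
```

Running the same comparison around the --ham command would show directly
whether nham moved at all.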

$ sa-learn --ham --mbox not-spam
Learned tokens from 0 message(s) (0 message(s) examined)
$ 
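
Since the quoted discussion above turns on mbox versus mmdf, one rough
check is to count the separator lines each format uses. The sketch below
is only a heuristic under stated assumptions: it counts lines beginning
with "From " (the mbox separator, although a body line can also start that
way) and lines consisting solely of four ^A characters (the mmdf marker
described above, which appears twice per message). The function name
count_delimiters is hypothetical, and this says nothing about how sa-learn
itself parses the file.

```python
def count_delimiters(path):
    """Count mbox-style 'From ' lines and MMDF-style ^A^A^A^A lines in a file."""
    mbox_from = 0
    mmdf_marker = 0
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b"From "):
                # Candidate mbox separator (may also be an unescaped body line).
                mbox_from += 1
            elif line.rstrip(b"\r\n") == b"\x01\x01\x01\x01":
                # MMDF marker line: four Ctrl-A characters and nothing else.
                mmdf_marker += 1
    return {"mbox_from_lines": mbox_from, "mmdf_marker_lines": mmdf_marker}
```

Calling count_delimiters("not-spam") and count_delimiters("isspam") and
comparing the results would show whether the two folders use the same
separators. A file with zero "From " lines and an even number of marker
lines would be consistent with the mmdf layout quoted above.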

Read Create Save Delete Undelete Print Folder Options Quit
Set mail options and preferences
Folder: not-spam                        Saturday February 13, 2010 2:34
---------------------------- [1] Message ----------------------------
    1 gerb...@zenez.co  11 Feb 10 6404  Quarterly ASCII posting of SCO Uni


Is there a message size limit for sa-learn?  The message in not-spam is
plain ASCII, no HTML.

$ wc -l not-spam
   6408 not-spam  <-- sa-learn --ham failed on not-spam folder with one message
$ 
$ wc -l isspam
   1039 isspam   <-- sa-learn --spam worked on isspam folder with one message
$ 
-- 
View this message in context: 
http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27573012.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
