At 10:52 AM 11/11/2004, Ronan wrote:
What format does sa-learn expect mail to be in?
I tried to feed it a 2 MB standard Unix email file of my spam folder, and it only registered it as one email, but with 4000 tokens...


Anyone? There is nothing in the man page about it.

By default, sa-learn assumes that each file passed to it contains a single email in RFC 822 format. It assumes that any directories are maildirs.


You're probably passing a Unix mailbox file rather than a single email file. (Note that a mailbox is not an email; it contains several.) That would explain why sa-learn counted the whole file as one message.
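
If you want to check whether a file really is a multi-message mailbox before feeding it to sa-learn, a rough sketch along these lines may help. It uses Python's standard mailbox module to count the messages in an mbox file; the function name count_messages is only an illustration and is not part of SpamAssassin.

```python
import mailbox


def count_messages(path):
    """Return the number of messages found in a Unix mbox file.

    Hypothetical helper for illustration only. It relies on the
    standard library splitting the file on "From " separator lines.
    """
    # create=False: do not create an empty mailbox if the path is wrong
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


# Example use (the path is an assumption):
#   count_messages("/home/ronan/mail/spam")
```

A count well above one suggests the file is a mailbox, not a single RFC 822 message.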

If you are passing unix mailbox files, you need --mbox as a parameter to sa-learn.

Some Unix tools, such as UW IMAP, generate a variant mailbox format that is supported by using --mbx. If --mbox causes problems, try --mbx.

(Note: despite using .mbx file extensions, Mozilla appears to use mbox format, but it's not 100% clear whether that's just a name change or whether they really did change formats after 1.0.)

