The --dir option in sa-learn seems to be broken. My guess is that it does not
process file names or paths that contain spaces, or something similar. I solved
the problem by converting the Outlook Express folders (.dbx) into mbox format
using the DbxConv utility (http://people.freenet.de/ukrebs/dbxconv.html), and
then processing the resulting .mbx files. It works great. I first move all the
relevant e-mails into the SpamNotDetected or Ham folder in OE, and then run this
batch file (warning: long lines may be wrapped):

@echo off
rem Copy the OE folders into a working directory.
copy /Y "L:\Documents and Settings\Harri\Local Settings\Application Data\Identities\{A634EAAA-98E9-45DB-88A1-07E2FD707D67}\Microsoft\Outlook Express\SpamNotDetected.dbx" "L:\Documents and Settings\Harri\My Documents\Spam\Dbx"
copy /Y "L:\Documents and Settings\Harri\Local Settings\Application Data\Identities\{A634EAAA-98E9-45DB-88A1-07E2FD707D67}\Microsoft\Outlook Express\Ham.dbx" "L:\Documents and Settings\Harri\My Documents\Spam\Dbx"
rem Convert the copied .dbx folders to mbox (.mbx) files.
L:
cd "L:\Documents and Settings\Harri\My Documents\Spam\Dbx"
DbxConv -mbx *.dbx
rem Learn from loose .eml files with --dir, if there are any.
if not exist "L:\Documents and Settings\Harri\My Documents\Spam\*.eml" goto :spam1
echo Spam .eml
"E:\Programs\SpamAssassin POP3 Proxy\sa-learn" --spam --showdots --dir "L:/Documents and Settings/Harri/My Documents/Spam" %1 %2 %3 %4
:spam1
if not exist "L:\Documents and Settings\Harri\My Documents\Ham\*.eml" goto :spam2
echo Ham .eml
"E:\Programs\SpamAssassin POP3 Proxy\sa-learn" --ham --showdots --dir "L:/Documents and Settings/Harri/My Documents/Ham" %1 %2 %3 %4
:spam2
rem Learn from the converted mbox files with --mbox.
if not exist "L:\Documents and Settings\Harri\My Documents\Spam\Dbx\*.mbx" goto :spam3
echo Spam .mbx
"E:\Programs\SpamAssassin POP3 Proxy\sa-learn" --spam --showdots --mbox "L:/Documents and Settings/Harri/My Documents/Spam/Dbx/SpamNotDetected.mbx" %1 %2 %3 %4
echo Ham .mbx
"E:\Programs\SpamAssassin POP3 Proxy\sa-learn" --ham --showdots --mbox "L:/Documents and Settings/Harri/My Documents/Spam/Dbx/Ham.mbx" %1 %2 %3 %4
:spam3
rem Clean up the working directories.
del /Q "L:\Documents and Settings\Harri\My Documents\Spam\*"
del /Q "L:\Documents and Settings\Harri\My Documents\Ham\*"
del /Q "L:\Documents and Settings\Harri\My Documents\Spam\Dbx\*"
:end
pause
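
Since the mbox route works where --dir apparently does not, loose .eml files
could in principle be bundled into an mbox file as well. The following is only
a minimal sketch, assuming Python 3 is available on the machine; the function
name eml_dir_to_mbox and any paths passed to it are hypothetical and not part
of SAProxy or sa-learn. It uses the standard mailbox module and has not been
checked against sa-learn itself.

```python
import mailbox
from pathlib import Path


def eml_dir_to_mbox(eml_dir, mbox_path):
    """Append every *.eml file in eml_dir to the mbox file at mbox_path.

    Hypothetical helper for illustration. Returns the number of
    messages written.
    """
    box = mailbox.mbox(str(mbox_path))  # the file is created if missing
    count = 0
    try:
        for eml in sorted(Path(eml_dir).glob("*.eml")):
            # Work on raw bytes so unusual encodings pass through untouched;
            # normalise CRLF so the mailbox module can apply the platform
            # line ending itself.
            raw = eml.read_bytes().replace(b"\r\n", b"\n")
            box.add(raw)
            count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

The resulting file could then be passed to sa-learn with --mbox in the same
way as the .mbx files in the batch file above.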

One question remains: how can I find out the status of the Bayes database? I
have only SAProxy and sa-learn; do I need to download the normal SpamAssassin
executable as well?

Original message below:
-------------------------

Just yesterday I installed SAProxy for use with an Outlook Express POP3
 account. It works, but when I try to teach it my old spam collection, it
 does not learn from it! I use the --dir option, and save the e-mails in .eml
 format from OE. Sometimes it learns a couple of messages out of one
 hundred. I also got some spam that SA didn't detect, saved those
 messages into the Spam folder, and ran sa-learn, and again it said 0 messages.

 The only explanation I can think of is that perhaps they all have the same
 message id? In that case sa-learn should have an option to ignore the message
 id and just add the message. Adding the same spam message twice should not
 matter either, because then it just gets extra weight?

 Also shouldn't SA detect that messages have the same id, and mark the
 message as spam? It could have a dynamic database of the latest one
 million messages that I got, for example.

 I don't know if sa-learn fails because of duplicate message ids, though.
 Any other suggestions are welcome. I also tried the -D switch once, but it
 didn't say anything about the ignored messages.
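
 The duplicate message id guess can at least be checked directly against the
 saved files. What follows is a minimal sketch, assuming Python 3 is
 available; message_id_report is a hypothetical helper name, and the check
 says nothing about how sa-learn itself decides to skip a message. It only
 counts how many .eml files lack a Message-ID header and which ids occur more
 than once.

```python
from collections import Counter
from email import message_from_bytes
from pathlib import Path


def message_id_report(eml_dir):
    """Summarise Message-ID headers across the *.eml files in eml_dir.

    Hypothetical helper for illustration. Returns (missing, duplicates):
    the number of files with no Message-ID header, and a dict mapping
    each id that occurs more than once to its count.
    """
    ids = Counter()
    missing = 0
    for eml in Path(eml_dir).glob("*.eml"):
        msg = message_from_bytes(eml.read_bytes())
        msg_id = msg.get("Message-ID")  # header lookup is case-insensitive
        if msg_id is None:
            missing += 1
        else:
            ids[str(msg_id).strip()] += 1
    duplicates = {mid: n for mid, n in ids.items() if n > 1}
    return missing, duplicates
```

 If both results come back empty (0 missing, no duplicates), the message id
 explanation looks unlikely and the cause is probably elsewhere.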

 How does the --forget option work? If I use it with all my spam, do the
 messages get learned as separate messages, or does it always remove the old
 data from the database first?

 --
 Harri



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
