On December 17, 2003 11:20 am, stan wrote:
> On Wed, Dec 17, 2003 at 11:00:04AM -0500, Pedro Sam wrote:
> > On December 17, 2003 10:16 am, stan wrote:
> > > BTW, I've got a macro that runs sa-learn, and another that runs
> > > spamassassin -r. If I run the 2nd one first, I get a message about
> > > 0 messages learned when I then run the first one. Whereas, if I
> > > reverse the order, I get 1 message learned. So it looks to me that
> > > I can't reproduce your error here.
> >
> > ... sorry, I wasn't clear before ...
> >
> > Reporting and learning both work with "spamassassin -r". BUT!!
> > Remember that SA markup must be stripped before reporting or
> > learning. Now,
> >
> > 1. The "sa-learn" command automatically strips SA markup before
> >    learning. WORKS!
> > 2. The "spamassassin -r" command claims to strip SA markup before
> >    reporting. WORKS! (i.e. it reports the spam without SA markup)
> > 3. The "spamassassin -r" command claims to strip SA markup before
> >    learning. DOES NOT WORK!! (i.e. it learns the spam WITH SA markup)
> >
> > Why did I suspect that 3 did not work? Because I found many tokens
> > in the Bayes database that could only have come from SA markup.
> > Tokens like "BAYES_99" were considered VERY spammy.
> >
> > I'm begging you, can someone please either confirm this problem so
> > we can report it, or tell me that it's my problem only ...
>
> OK, if the problem exists, I should have it. But I'm a newbie here.
> Tell me how to check my tokens, and I'll report back.
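First, a side note on the markup issue itself: until this is confirmed and
fixed, the safest workaround is to strip the markup yourself before feeding
anything to the learner. I haven't tested this exact sequence (the filenames
are just placeholders), but something along these lines should do it, since
"spamassassin -d" removes SA markup and sa-learn will take the cleaned file:

    # strip SA markup from the message, then learn the clean copy as spam
    spamassassin -d < spam-msg.txt > spam-msg.clean
    sa-learn --spam spam-msg.clean

Now, to check what's actually in your Bayes database: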
try this:

sa-learn --dump all | sort -n > SOME_FILE

You should get something like the following:

...
0.978   2   0   1067239234   UD:mygrantnow.org
0.985   3   0   1066771497   N:junkN.jpg
0.958   1   0   1067155182   N:NsN-NkwN-N-jNiN
0.958   1   0   1067040199   H*r:8LN3VP9W.vip.fi
0.985   3   0   1071089788   comp-01_05.gif
0.958   1   0   1067324476   HTo:U*sarajonsson
0.958   1   0   1066969001   H*M:7719
0.958   1   0   1067081011   H*m:h9PBOoog018734
...

The first column is the "spamminess", the second is the number of occurrences
as spam, the third is the number of occurrences as ham, the fourth is the time
(in Unix seconds), and the fifth is the token itself...

So if you find tokens that could only have come from SA markup (stuff like
BAYES_99), then it probably means the mechanism used to invoke Bayes learning
did not strip the SA markup...

Pedro

-- 
Sauron is alive in Argentina!
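P.S. A quick way to spot just the markup-derived tokens (assuming they show
up verbatim in the dump, like the BAYES_99 ones I found) is to grep the dump
instead of sorting it:

    sa-learn --dump all | grep BAYES_

If that prints anything, the markup almost certainly leaked into the database.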