> First, run spamassassin -tD < sample-spam.txt and look at the debug
> output. Is bayes even enabled? Are there enough tokens?

debug: Score set 0 chosen.
debug: running in taint mode? no
debug: using "/usr/share/spamassassin" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/root/.spamassassin" for user state dir
debug: using "/etc/MailScanner/spam.assassin.prefs.conf" for user prefs file
debug: bayes: 7125 tie-ing to DB file R/O /var/spool/MailScanner/spamassassin/bayes_toks
debug: bayes: 7125 tie-ing to DB file R/O /var/spool/MailScanner/spamassassin/bayes_seen
debug: debug: Only 86 ham(s) in Bayes DB < 200
debug: bayes: 7125 untie-ing
debug: bayes: 7125 untie-ing db_toks
debug: bayes: 7125 untie-ing db_seen
debug: Score set 1 chosen.
debug: Initialising learner
debug: bayes: 7125 tie-ing to DB file R/O /var/spool/MailScanner/spamassassin/bayes_toks
debug: bayes: 7125 tie-ing to DB file R/O /var/spool/MailScanner/spamassassin/bayes_seen
debug: debug: Only 86 ham(s) in Bayes DB < 200
debug: bayes: 7125 untie-ing
debug: bayes: 7125 untie-ing db_toks
debug: bayes: 7125 untie-ing db_seen
debug: is Net::DNS::Resolver available? no
debug: is DNS available? 0
debug: running header regexp tests; score so far=0
debug: running body-text per-line regexp tests; score so far=4.3
debug: running raw-body-text per-line regexp tests; score so far=11
debug: running uri tests; score so far=12.6
debug: uri tests: Done uriRE
debug: running full-text regexp tests; score so far=12.6
debug: Razor2 is not available
debug: DCC is not available: dccproc not found
debug: Current PATH is: /usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin:/w98wh/wildCAT/bin:/w98wh/wild98.com/cron
debug: Pyzor is not available: pyzor not found
debug: all '*To' addrs: [EMAIL PROTECTED]
debug: time token found: 29566705 expected (from Date): 29566705: 0
debug: all '*From' addrs: [EMAIL PROTECTED]
debug: running meta tests; score so far=12.6
debug: auto-learn? safety=4, ham=-2, spam=15, body-hits=8.3, head-hits=5.9
debug: auto-learn: currently using scoreset 1.  no need to recompute.
debug: auto-learn? no: inside auto-learn thresholds or safety zone around required_hits
debug: is spam? score=12.7 required=5
tests=BASE64_ENC_TEXT,HAIR_LOSS,HTML_20_30,HTML_FONT_COLOR_RED,HTML_MESSAGE,
HTML_TAG_BALANCE_BODY,HTML_TAG_BALANCE_TABLE,LOSE_POUNDS,MIME_HTML_ONLY,
NO_QS_ASKED
>From nobody Sat May 31 00:02:34 2003
Received: from localhost [127.0.0.1] by ns2.wild98webhosting.com
        with SpamAssassin (2.55 1.174.2.19-2003-05-19-exp);
        Sat, 26 Jul 2003 08:12:05 -0700
From: "Roland Crane" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Lose weight
Date: Fri, 30 May 2003 21:20:25 +0000
Message-Id: <[EMAIL PROTECTED]>
X-Spam-Flag: YES
X-Spam-Status: Yes, hits=12.7 required=5.0
        tests=BASE64_ENC_TEXT,HAIR_LOSS,HTML_20_30,HTML_FONT_COLOR_RED,
              HTML_MESSAGE,HTML_TAG_BALANCE_BODY,HTML_TAG_BALANCE_TABLE,
              LOSE_POUNDS,MIME_HTML_ONLY,NO_QS_ASKED
        version=2.55
X-Spam-Level: ************
X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_3F229A45.12C9B0A9"

This is a multi-part message in MIME format.

------------=_3F229A45.12C9B0A9
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

This mail is probably spam.  The original message has been attached
along with this report, so you can recognize or block similar unwanted
mail in future.  See http://spamassassin.org/tag/ for more details.

Content preview:  Regain your youth The discovery that reverses signs of
  aging naturally and that is completely safe and effective is on sale
  for a limited time! Buy a two-month supply of our product and we will
  give you one month free! [...]

Content analysis details:   (12.70 points, 5 required)
LOSE_POUNDS        (4.3 points)  Subject talks about losing pounds
HAIR_LOSS          (2.8 points)  BODY: Cures Baldness
NO_QS_ASKED        (2.1 points)  BODY: Doesn't ask any questions
HTML_20_30         (1.2 points)  BODY: Message is 20% to 30% HTML
HTML_FONT_COLOR_RED (0.1 points)  BODY: HTML font color is red
HTML_TAG_BALANCE_TABLE (0.2 points)  BODY: HTML is missing "table" close tags
HTML_MESSAGE       (0.1 points)  BODY: HTML included in message
HTML_TAG_BALANCE_BODY (0.2 points)  BODY: HTML has unbalanced "body" tags
BASE64_ENC_TEXT    (1.6 points)  RAW: Message text disguised using base-64 encoding
MIME_HTML_ONLY     (0.1 points)  Message only has text/html MIME parts
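
As a sanity check on the report above, the per-rule points can be added up and compared with the 12.70 total. This is only a rough sketch: the regular expression and the parse_report helper are my own (hypothetical, not part of SpamAssassin), and the report text is simply pasted in as a string, wrapped continuation lines included.

```python
import re

# Report lines copied from the "Content analysis details" section,
# including the two wrapped continuation lines ("tags", "encoding").
REPORT = """\
LOSE_POUNDS        (4.3 points)  Subject talks about losing pounds
HAIR_LOSS          (2.8 points)  BODY: Cures Baldness
NO_QS_ASKED        (2.1 points)  BODY: Doesn't ask any questions
HTML_20_30         (1.2 points)  BODY: Message is 20% to 30% HTML
HTML_FONT_COLOR_RED (0.1 points)  BODY: HTML font color is red
HTML_TAG_BALANCE_TABLE (0.2 points)  BODY: HTML is missing "table" close
tags
HTML_MESSAGE       (0.1 points)  BODY: HTML included in message
HTML_TAG_BALANCE_BODY (0.2 points)  BODY: HTML has unbalanced "body" tags
BASE64_ENC_TEXT    (1.6 points)  RAW: Message text disguised using base-64
encoding
MIME_HTML_ONLY     (0.1 points)  Message only has text/html MIME parts
"""

# Hypothetical pattern: an upper-case rule name, then "(N points)".
RULE_LINE = re.compile(
    r"^(?P<rule>[A-Z0-9_]+)\s+\((?P<points>-?\d+(?:\.\d+)?) points\)"
)


def parse_report(text):
    """Map each rule name to its score; lines that do not match are skipped."""
    scores = {}
    for line in text.splitlines():
        match = RULE_LINE.match(line)
        if match:
            scores[match.group("rule")] = float(match.group("points"))
    return scores


scores = parse_report(REPORT)
# Round to one decimal place, the precision the report itself uses.
total = round(sum(scores.values()), 1)
print(len(scores), "rules, total", total)
```

None of the ten rules is a Bayes test, so the whole score comes from the static rules.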


> How are you calling SA (procmail, mailscanner, amavis, etc etc)?

MailScanner

> What user does SA run as when you get email?
> What user did you run your training as?

root, but I placed all the bayes* files in a global area
(/var/spool/MailScanner/spamassassin), which the debug log shows SA found
during the test above.

I ran sa-learn on over 900MB of spam, including the message I just ran the
debug test on, and no bayes tags showed up.
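
The one Bayes line in the debug output that looks like a complaint is "Only 86 ham(s) in Bayes DB < 200". Here is a rough sketch for pulling lines like that out of a saved debug run. The bayes_shortfalls helper and its pattern are my own, and I am only assuming that a matching "spam(s)" message exists, since my log shows the ham one alone.

```python
import re

# A few lines from the debug run, pasted in as a string.
DEBUG_SAMPLE = """\
debug: bayes: 7125 tie-ing to DB file R/O
/var/spool/MailScanner/spamassassin/bayes_toks
debug: debug: Only 86 ham(s) in Bayes DB < 200
debug: bayes: 7125 untie-ing
debug: debug: Only 86 ham(s) in Bayes DB < 200
"""

# Matches "Only N ham(s) in Bayes DB < M"; the "spam" alternative is an
# assumption, the log above only contains the ham variant.
SHORTFALL = re.compile(r"Only (\d+) (ham|spam)\(s\) in Bayes DB < (\d+)")


def bayes_shortfalls(debug_text):
    """Return sorted, de-duplicated (kind, have, need) tuples from the log."""
    found = set()
    for match in SHORTFALL.finditer(debug_text):
        found.add((match.group(2), int(match.group(1)), int(match.group(3))))
    return sorted(found)


for kind, have, need in bayes_shortfalls(DEBUG_SAMPLE):
    print(f"{kind}: {have} learned, {need} wanted, {need - have} short")
```

If that message means what it appears to say, the ham count is the shortfall here, not the spam count, however much spam I feed to sa-learn.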

Is there anything else about my configuration I can offer the list to get
the ball rolling on this? I have 3 other machines running sa-learn over this
gigantic collection of spam, so knowing how to get bayes to work afterwards
would sure be nice.

ian




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
