Update: Still NOT working, but I'm giving it hell trying to figure out why :)
First, a couple of answers to others' questions:

- John, others: I'm not an ISP. "High" is relative, I'm sure, but the volume is much higher than would allow me to duplicate and review every flagged message. Right now I'm running at about 10% before I migrate one of my larger domains. Mail is relayed to Exchange servers. Users do not have IMAP accounts on the box; there are a few local users with POP only. I don't configure or allow anyone to submit messages for training directly.

- Re: no auto-training, or careful auto-training. I get it. I'm migrating from a server that has run for years with auto-learn on, set at conservative learn values. Never had any trouble with it, thank goodness. Looking at the messages that would be auto-learned, I've never found one in my corpus that would have been learned and should not have been. The volume would just be too high for me to go through each one personally. I have had "problem" users who get a lot of spam misses, and I plan to set up a way for them to submit their spam to me (not autolearn) for review and manual training as needed.

- Matus: re: "autolearn=unavailable apparently due to not accessible bayes database [due to permissions]". I hope you are right. That would make sense to me. See below, please; I think I listed them all. Config and permissions look good to me, but I'd be grateful to have anything I missed pointed out by an experienced eye.

My old server, running embarrassingly old versions of everything, works great, so auto-learn in general has been a good fit for my environment. I get that it's not for everyone. But at least it SHOULD work, and let me choose to tweak it or turn it off. As far as I can tell it is not working, at all.

So here's where I am:

1. I stepped back and went through all my configurations carefully. spamassassin is being run via amavisd, as the amavis user. Site-wide config; no other users have direct access. POP accounts and relay accounts only.

2.
From prior research before asking for help, I understood that no spam was necessary for auto-learn to work, but one person here said I had to be at the minimum (200 by default) before it would. So, to rule that out as the issue, I manually fed it plenty of spam and ham. For others who might read this thread in the archives: I was having trouble getting enough learned because of the default size limit in my version of SA/sa-learn. With some digging I found out how to raise that limit, and then I had plenty of spam to feed:

su amavis -c 'sa-learn -D --spam --showdots --max-size=1000000 --mbox /home/mail/spam'

[root@mail2 amavisd]# su amavis -c 'sa-learn --dump magic'
0.000          0          3          0  non-token data: bayes db version
0.000          0        349          0  non-token data: nspam
0.000          0        478          0  non-token data: nham
0.000          0     166030          0  non-token data: ntokens
0.000          0 1501594564          0  non-token data: oldest atime
0.000          0 1502289189          0  non-token data: newest atime

3. Next up were questions about the config and permissions. I checked my setup and it looked OK, but I even opened some directories up to 777 for testing. This is my config; I'd be grateful if anyone who sees anything wrong would point it out. I include the amavis stuff just to show it is running as, and invoked by, the amavis user.

3a. amavis

In /usr/lib/systemd/system/amavisd.service:

User=amavis
Group=amavis
ExecStart=/usr/sbin/amavisd -c /etc/amavisd/amavisd.conf

The amavis user's home dir per /etc/passwd is /var/spool/amavisd (verified with cd ~amavis).

3b. local.cf

My spamassassin local.cf is at /etc/mail/spamassassin/local.cf. I verified this is the one being used by putting an error line in it and restarting amavisd, which complains about the error. Fixed, of course, and continuing...

In local.cf I have these related settings:

use_bayes 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -1.7
bayes_auto_learn_threshold_spam 10.0
bayes_path /etc/mail/bayes/bayes
bayes_file_mode 0777

3c.
bayes

For troubleshooting I set the permissions to 777 on /etc/mail/bayes and its files. This is the only occurrence of the "bayes" files on the server:

[root@mail2 amavisd]# ls -la /etc/mail/bayes
total 4196
drwxrwxrwx 2 amavis amavis    4096 Aug  9 13:49 .
drwxr-xr-x 4 amavis amavis    4096 Aug  3 13:02 ..
-rwxrwxrwx 1 amavis amavis   86016 Aug  9 09:51 bayes_seen
-rwxrwxrwx 1 amavis amavis 5246976 Aug  9 13:49 bayes_toks

3d. amavis spamassassin folder settings

For amavis, which is calling spamassassin via its perl libraries (I am not running spamd), I have the related configuration parts as:

$MYHOME = '/var/spool/amavisd';   # a convenient default for other settings, -H
$TEMPBASE = "$MYHOME/tmp";        # working directory, needs to exist, -T
$ENV{TMPDIR} = $TEMPBASE;         # environment variable TMPDIR, used by SA, etc.
$db_home = "$MYHOME/db";          # dir for bdb nanny/cache/snmp databases, -D
#$helpers_home = "$MYHOME/var";   # working directory for SpamAssassin, -S
$helpers_home = "$MYHOME";        # working directory for SpamAssassin, -S

3e. spamassassin directory

And for spamassassin, its files are being placed in the amavisd home directory as configured in amavisd.conf. I am careful to run sa-update or SA debug commands only as the amavis user, so as not to create any other .spamassassin folders under root, etc. This is the only occurrence of .spamassassin on the server:

[root@mail2 amavisd]# locate .spamassassin
/var/spool/amavisd/.spamassassin
/var/spool/amavisd/.spamassassin/user_prefs

3f. amavis (spamassassin's user) home directory

[root@mail2 amavisd]# ls -la /var/spool/amavisd
total 32
drwxr-x--- 6 amavis amavis 4096 Aug  9 20:49 .
drwxr-xr-x 8 root   root   4096 Nov  5  2016 ..
-rw------- 1 amavis amavis  101 Aug  9 11:17 .bash_history
-rw-r--r-- 1 amavis amavis    0 Aug  9 20:49 black.lst
drwxr-x--- 2 amavis amavis 4096 Aug  9 20:30 db
drwxr-x--- 2 amavis amavis 4096 Apr 19 07:28 quarantine
drwx------ 2 amavis amavis 4096 Aug  8 15:32 .spamassassin
drwxr-x--- 5 amavis amavis 4096 Aug 10 08:26 tmp
-rw-r--r-- 1 amavis amavis   37 Aug  7 19:28 white.lst

3g. .spamassassin folder

[root@mail2 amavisd]# ls -la /var/spool/amavisd/.spamassassin
total 12
drwx------ 2 amavis amavis 4096 Aug  8 15:32 .
drwxr-x--- 6 amavis amavis 4096 Aug  9 20:49 ..
-rw-r--r-- 1 amavis amavis 1869 Aug  8 15:32 user_prefs

4. Logging

I managed to get amavisd configured to let the more verbose rule listing for the header, and the score details, come through in the log for my troubleshooting as well.

5. Results

After running this config, now with a loaded bayes database, it has yet to auto-learn a single spam (or ham). Just through yesterday, my spam quarantine has over 50 pretty high-scoring spams in it.

I've studied tflags and now understand what they are (for others, here's a good link):
http://commons.oreilly.com/wiki/index.php/SpamAssassin/SpamAssassin_Rules

I understand SA requires at least 3 points from the header and 3 points from the body to auto-learn as spam. I understand some tflags preclude the use of a test in the autolearn score. I understand bayes points don't count. But surely one of the 50 high scorers I caught yesterday qualified. Yet no autolearn: always autolearn=unavailable or autolearn=no. I've turned on verbose debugging for bayes, but I don't see any errors or feedback on the reasons for the no-learn.

Looked at yesterday's log:

cat /var/log/maillog.1|grep autolearn=unavailable|wc -l
60

Now amavisd has the option of giving a verbose log line with all the score stuff. amavis now adds an "autolearn score" to the log as well. Not sure how that is calculated, but it's interesting anyway. It would be great if it were h/b/t (header/body/total).
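
For anyone repeating that log check, here is a rough sketch that tallies every autolearn= result instead of counting only the unavailable ones, so it is easy to see whether anything at all was learned. It assumes each log line carries a token of the form autolearn=<status>, as in my amavisd logs; the function name tally_autolearn is just something made up for illustration.

```shell
# Tally autolearn outcomes from amavisd log lines read on stdin.
# Assumption: each relevant line contains a token like "autolearn=unavailable".
# tally_autolearn is a hypothetical helper name, not part of amavisd or SA.
tally_autolearn() {
    # -o prints only the matching token; the "=" must directly follow
    # "autolearn", so autolearn_force= and autolearnscore= are not matched.
    grep -o 'autolearn=[a-z]*' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'   # print "status count", one per line
}

# Usage (log path as used above; adjust for your system):
#   tally_autolearn < /var/log/maillog.1
```

If the output only ever shows autolearn=unavailable and autolearn=no, with no ham or spam lines at all, that matches what I am seeing.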
Anyway, here is a sample of that verbose log line:

Aug 10 00:38:39 mail2 amavis[15959]: (15959-08) Blocked SPAM {DiscardedInbound,Quarantined}, [89.43.62.101]:47955 [89.43.62.101] ESMTP/LMTP <cont...@hewis.versateye.com> -> <shor...@myvirt.org>, (ESMTP://[89.43.62.101]:47955), quarantine: spa...@myvirt.org, Queue-ID: 7F64A70, mail_id: yxtV5c7b1N8r, b: tDtWV84sR, Hits: 23.553, size: 365419, Subject: "Joanna Gaines Drops Bombshell.", From: <cont...@hewis.versateye.com>, helo=hewis.versateye.com, Tests: [BAYES_999=0.2,BAYES_99=3.5,DATE_IN_PAST_03_06=1.592,DCC_CHECK=3.2,DIGEST_MULTIPLE=0.293,HTML_MESSAGE=0.001,HTML_MIME_NO_HTML_TAG=0.377,MIME_HTML_ONLY=0.723,MISSING_MID=0.497,NORMAL_HTTP_TO_IP=0.001,RAZOR2_CF_RANGE_51_100=0.5,RAZOR2_CF_RANGE_E8_51_100=1.886,RAZOR2_CHECK=2.5,RCVD_IN_BRBL_LASTEXT=1.449,RDNS_NONE=0.793,SPF_HELO_PASS=-0.001,SPF_PASS=-0.001,STYLE_GIBBERISH=3.093,URIBL_ABUSE_SURBL=1.25,URIBL_BLACK=1.7], autolearn=unavailable autolearn_force=no, autolearnscore=21.113, 5061 ms

As usual, autolearn=unavailable. My suspicion is that many of those "unavailable" results should have been a learn. Surely out of 60, one was valid to autolearn.

I don't know what to look for next to troubleshoot. Sure hoping it's just a permissions issue. I'm back at a brick wall.

How can I help you help me?
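
To cross-check the autolearnscore in a log line like the sample above, here is a rough sketch that pulls the Tests: [...] list out of a single line and sums the scores while skipping the BAYES_* rules, since bayes points don't count toward autolearn. The function name sum_non_bayes is hypothetical, and it assumes the Tests: [NAME=score,...] layout shown in that sample. It knows nothing about tflags or the header/body split, so treat it as a sanity check only.

```shell
# Sum the per-rule scores in the "Tests: [...]" field of one amavisd log
# line (read on stdin), leaving out the BAYES_* rules.
# Assumption: the field looks like "Tests: [NAME=score,NAME=score,...]".
# sum_non_bayes is a hypothetical helper name, not an amavisd or SA tool.
sum_non_bayes() {
    # 1. keep only the text between "Tests: [" and the closing "]"
    # 2. put each NAME=score pair on its own line
    # 3. add up the scores of every rule whose name does not start with BAYES_
    sed -n 's/.*Tests: \[\([^]]*\)].*/\1/p' |
        tr ',' '\n' |
        awk -F= '$1 !~ /^BAYES_/ { total += $2 } END { printf "%.3f\n", total }'
}

# Usage: feed it one log line, e.g.
#   grep 'mail_id: yxtV5c7b1N8r' /var/log/maillog.1 | sum_non_bayes
```

On the sample line this comes to 19.853, which is Hits (23.553) minus the two BAYES rules (3.7 together), and it does not match autolearnscore=21.113. If I read the AutoLearnThreshold plugin documentation correctly, the autolearn decision uses the non-Bayes score sets, so the per-rule scores printed in the log are not necessarily the ones the decision was based on.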