Re: Moving Spam to Junk Folder

2020-09-03 Thread David B Funk

On Thu, 3 Sep 2020, bobby wrote:


I am following this tutorial:
https://www.linuxbabe.com/redhat/spamassassin-centos-rhel-block-email-spam

I followed the steps in "Move Spam into the Junk Folder".  When I send an 
email from a blacklisted e-mail address, I get a bounce e-mail from my e-mail 
server.  Here is what is in my spamass-milter file:

EXTRA_FLAGS="-m -r 8 -R NO_SPAM -i 127.0.0.1 -g sa-milt -- --max-size=512"

I would prefer it to go into my Junk folder.  How can I make this happen?


Bobby,

You need to read the spamass-milter documentation to understand what those 
options are doing.


That "-r 8" tells spamass-milter to return an 'SMFIS_REJECT' status to postfix if 
the spam score is over 8. This causes postfix to refuse to accept the message 
at all (sort of like when somebody tries to send a message to a bogus 
recipient).


So if postfix never lets spam get in the front door, it cannot be delivered to 
any kind of "Junk Folder".



--
Dave Funk                          University of Iowa
                                   College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{

Re: Problem installing sa on my pi 3b+

2021-04-08 Thread David B Funk

On Fri, 9 Apr 2021, spamassas...@mach2.franken.de wrote:


Am 07.04.2021 um 12:27 schrieb Antony Stone:



I am running said packet install from an internet tutorial.
Who wrote that tutorial and where does it point you to get the packages 
from?



Antony.


Hmm, it says execute the following commands:

    sudo apt-get update
    sudo apt-get install spamassassin

Without any further params. How am I supposed to know where that command does 
get its package from???


Christian


Christian,

Use the "apt show spamassassin" command to show the information about the 
spamassassin package.

One of the lines of output will be something like:
 APT-Sources: http://us.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages

That will tell you the package repository that it's getting that particular 
package from.
For more info about the collection of sources that 'apt' & 'apt-get' are using, 
look at the "sources.list" config files in the /etc/apt/ directory.


HTH

Dave


Re: SA seems powerless against marketing emails for SEO/web development

2021-04-22 Thread David B Funk

On Thu, 22 Apr 2021, Matus UHLAR - fantomas wrote:


On 22.04.21 14:21, Steve Dondley wrote:

pts  rule name           description
---- ------------------- --------------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE  RBL: Sender listed at https://www.dnswl.org/, no trust
                         [209.85.210.44 listed in list.dnswl.org]
-1.0 BAYES_00            BODY: Bayes spam probability is 0 to 1%
                         [score: 0.]

[snip..]

-0.0 RCVD_IN_MSPIKE_WL  Mailspike good senders

This email is a bit of an outlier as most of these emails will get flagged 
with bayes_99 and bayes_999 but this one actually gives it bayes_00.



My bayes filter has been trained with about 2000 examples of spam and ham.


now, train as needed - this one as spam.


In that spam there was a tracking link at the bottom with a URL of the form:
https://name-company-track.appspot.com/Firebase?bunch-of-long-tracking-variables

How hard would it be to modify the uribl lookup code so that it did not truncate 
host names, so we could create uribl entries of the form 
"name-company-track.appspot.com"? Or would that be prohibitively expensive in 
lookups?


I regularly see phish/spam that has URL hosts of the form some-name.blogspot.com 
or other-name.webhosting.com and it would be nice to be able to slam those 
things into a uribl list (I run my own).
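
To make the question concrete, here is a rough Python sketch of the two 
lookup names involved. It is not SpamAssassin's code: the "keep the last two 
labels" truncation is a deliberate simplification, and the zone name 
"uribl.example" is a made-up placeholder for a locally run list.

```python
def naive_registered_domain(host, labels=2):
    """Keep only the last `labels` labels of a host name.

    Simplification for illustration: real code would consult a
    public-suffix style list rather than counting labels.
    """
    parts = host.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])


def uribl_query_name(name, zone):
    """Build the DNS name that would be queried for `name` in `zone`."""
    return f"{name}.{zone}"


HOST = "name-company-track.appspot.com"
ZONE = "uribl.example"  # hypothetical zone, not a real service

# Truncated form: every appspot.com customer collapses to one entry.
truncated_query = uribl_query_name(naive_registered_domain(HOST), ZONE)
# Full-host form: one entry per abusive host name.
full_host_query = uribl_query_name(HOST, ZONE)

print(truncated_query)
print(full_host_query)
```

With truncation, a list entry can only name the shared hosting domain; the 
full-host form is what would let a list target a single customer's host.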





Re: Why single periods in regex in spamassassin rules?

2021-04-23 Thread David B Funk

On Fri, 23 Apr 2021, Steve Dondley wrote:


I'm looking at KAM.cf. There is this rule:

body __KAM_WEB2  /INDIA based IT|indian.based.website|certified.it.company/i


I'm wondering if there is a good reason why a single period is used instead of 
something like \s+ which would catch multiple spaces whereas a single period 
doesn't.


Because '/indian.based.website/' will match 'indian-based_website', but a 
version using \s will not.
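
A quick way to see the difference is to try both forms on a few strings. 
This is only a sketch using Python's re module as a stand-in for the Perl 
regexes SpamAssassin uses; the \s+ variant and the sample strings are made up 
for the comparison and are not taken from KAM.cf.

```python
import re

# Fragment from the quoted rule: '.' matches any single character.
dot_pattern = re.compile(r"indian.based.website", re.IGNORECASE)
# Hypothetical alternative: \s+ matches one or more whitespace characters only.
space_pattern = re.compile(r"indian\s+based\s+website", re.IGNORECASE)

samples = [
    "indian based website",
    "indian-based_website",
    "indian  based website",  # two spaces after "indian"
]

# Map each sample to (matched by '.', matched by '\s+').
results = {
    s: (bool(dot_pattern.search(s)), bool(space_pattern.search(s)))
    for s in samples
}

for text, (dot_hit, space_hit) in results.items():
    print(f"{text!r}: dot={dot_hit} whitespace={space_hit}")
```

The single period catches punctuation used as a separator, which \s+ cannot; 
the trade-off is that it only ever covers exactly one character.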





Re: Another evil number

2021-06-25 Thread David B Funk

On Fri, 25 Jun 2021, Greg Troxel wrote:



RW  writes:


You can reach out
   to our Customer Support Team+1 (800) 781 - 2511.


Is it common in the US to put 800 in brackets like that? In my
experience brackets normally go around either country codes or area
codes, digits that may be optional.


Yes, it is common.  The proper form is

 +1 800 782 2511

but people in the US do not write numbers like that.

The normal way in the US would be

 (800) 782-2511

and I find the spaces around the - to be unusual.  But really there is a
fair degree of variation.


And then there's the obfuscation that spammers/phishers use.
Here's an example from a recent message I found in one of my spam traps:


if you have any issue regarding your order.

Reach us at +1 [805} 429-6748

Thanks & Regards

+1 [805} 429-6748


Those bracket/brace mismatches are verbatim.



Re: Identifying Amazon hosts...

2021-07-28 Thread David B Funk

On Wed, 28 Jul 2021, Antony Stone wrote:


On Wednesday 28 July 2021 at 19:51:49, Pedro David Marco wrote:


Hi!
i have spam with this header:

 Received: from a48-115.smtp-out.amazonses.com (HELO
a48-115.smtp-out.amazonses.com) (54.240.48.115)

Is there any way, based on its fqdn, to know whether an Amazon smtp host is
public or dedicated?


Apologies for what may seem like a silly question, but what's the difference?


I'm assuming he's asking if there's a chance that it's an open-relay SMTP server 
or one dedicated to Amazon client systems.


I'd be shocked if it was an open-relay, it'd probably be hammered by now if it 
were.


There's enough spam coming from AWS clients as-is. I've seen malware and phishes 
coming out of AWS, so I wouldn't unconditionally trust anything from them.





Re: Customise hostname shown in X-Spam-Checker-Version?

2021-07-30 Thread David B Funk

On Fri, 30 Jul 2021, David Bürgin wrote:


David Bürgin:

Resolved. Perhaps the documentation should be updated.


There are notes for options ‘remove_header’ and ‘clear_headers’ that
‘X-Spam-Checker-Version is not removable’, so a straightforward fix to
the documentation would be replacing sentence

note that Checker-Version can not be changed or removed

with

note that Checker-Version can not be removed


More to the point:
 the X-Spam-Checker-Version header is not removable and the Version-number 
WITHIN the header is not changeable, the rest of the header is customizable.





Re: CVD_IN_DNSWL_HI ?

2021-10-11 Thread David B Funk

On Mon, 11 Oct 2021, Jerry Malcolm wrote:



I am getting tons of emails that are very obviously spam (elongation, russian 
beauties, etc) that are getting a -5 score added on the white list test:

CVD_IN_DNSWL_HI  RBL: Sender listed at https://www.dnswl.org/, high trust

I'm curious about the usefulness of a white list that spammers have obviously 
been able to defeat. And with the -5.0 score added (subtracted) in to the 
total, there's almost no chance for other tests to overcome it with 10 points 
to get the score to 5.0


What is the easiest way to disable this 'trusted white list' tester that is 
sabotaging so many of my spam scores?


That's one of the several sets of evals derived from the __RCVD_IN_DNSWL test of 
the "list.dnswl.org" rbl.


You can disable just the RCVD_IN_DNSWL_HI rule by setting its score to 0.
EG: in your local.cf add a line that looks like:

# disable RCVD_IN_DNSWL_HI
score RCVD_IN_DNSWL_HI 0

You can disable the whole kit of rules derived from that rbl by setting the base 
rule to 0:


score __RCVD_IN_DNSWL 0




Re: CVD_IN_DNSWL_HI ?

2021-10-11 Thread David B Funk

On Mon, 11 Oct 2021, David B Funk wrote:


On Mon, 11 Oct 2021, Jerry Malcolm wrote:



I am getting tons of emails that are very obviously spam (elongation, 
russian beauties, etc) that are getting a -5 score added on the white list 
test:

CVD_IN_DNSWL_HI  RBL: Sender listed at https://www.dnswl.org/, high trust

I'm curious about the usefulness of a white list that spammers have 
obviously been able to defeat. And with the -5.0 score added (subtracted) 
in to the total, there's almost no chance for other tests to overcome it 
with 10 points to get the score to 5.0


What is the easiest way to disable this 'trusted white list' tester that 
is sabotaging so many of my spam scores?


That's one of the several sets of evals derived from the __RCVD_IN_DNSWL test 
of the "list.dnswl.org" rbl.


You can disable just the RCVD_IN_DNSWL_HI rule by setting its score to 0.
EG: in your local.cf add a line that looks like:

# disable RCVD_IN_DNSWL_HI
score RCVD_IN_DNSWL_HI 0

You can disable the whole kit of rules derived from that rbl by setting the 
base rule to 0:


score __RCVD_IN_DNSWL 0



The other thing you should do is to report false-positives to the dnswl.org 
site.

See: https://www.dnswl.org/?page_id=17

You first might want to verify that your FPs aren't being generated by some 
upstream relay that is trusted but, due to some configuration issue, is 
"masking" the spam source.


If you put a copy of one of the offending spams in pastebin.com and post the URL 
here we can look at it with you to see if we can spot your issue.





Re: handle_user and connect to spamd failed

2021-10-19 Thread David B Funk

On Tue, 19 Oct 2021, Linkcheck wrote:


Ok, thanks, Dave.


'--helper-home-dir' option needs an '='


Also, --max-children?

I have been playing with options based on suggestions here. I now have the 
spamassassin options as:


OPTIONS="--nouser-config -4 -i 127.0.0.1 --max-children=5 --helper-home-dir=/var/lib/spamassassin -u debian-spamd"


and the spamass-milter options:

OPTIONS="-u spamass-milter -- -d 127.0.0.1"

Once I remembered that spamass-milter also needed to be restarted, along with 
spamassassin and postfix, I made more progress. :(


That has fixed both warnings but the warning message "Could not retrieve 
sendmail macro 'i'" has returned; thought I'd got rid of that one for good. I 
tried adding 'i' to the postfix milter_connect_macros but no difference. I've 
never discovered what that macro is supposed to be nor whence/how it derives.


Thanks to everyone who has contributed to this thread. If someone could round 
it off with the i macro solution that should be it.


spamass-milter wants the 'i' macro in both the milter_mail_macros and 
milter_rcpt_macros postfix config parameters.
Putting it in the milter_connect_macros doesn't do any good, that's not where 
spamass-milter looks for it.
(at least in the version 0.3.2 code that I looked at, YMMV version wise, grep 
the Source Luke).


The 'i' macro is supposed to be the message queue-id value.




Re: Emails from gmail.com bypassing Spamassassin scoring

2022-02-07 Thread David B Funk

How big was the message? (attached images can be pretty big).

Depending on the "glue" you use to connect your MTA to SA, it may have some 
kind of size restriction.


For example, the 'spamc' client has a 'max-size' parameter (which defaults to 
500KB). Any message larger than that size will not be passed to SA (IE it will 
skip scanning).
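
To get a feel for how quickly an image attachment crosses a limit like that, 
here is a back-of-the-envelope Python sketch. The assumptions are loud ones: 
the limit is taken as 500 * 1024 bytes, the comparison is "strictly larger", 
the attachment is base64 encoded in 76-character lines, and the rest of the 
message is ignored. Check your own glue's documentation for the real numbers.

```python
import math

# Assumption: "500KB" interpreted as 500 * 1024 bytes.
ASSUMED_MAX_SIZE = 500 * 1024


def base64_encoded_size(raw_bytes, line_length=76):
    """Estimate the on-the-wire size of a base64 encoded attachment.

    Base64 turns every 3 input bytes into 4 output characters; each
    output line also carries a 2-byte CRLF line ending.
    """
    encoded_chars = 4 * math.ceil(raw_bytes / 3)
    line_breaks = math.ceil(encoded_chars / line_length)
    return encoded_chars + 2 * line_breaks


def exceeds_limit(raw_bytes, max_size=ASSUMED_MAX_SIZE):
    """True if the encoded attachment alone is larger than max_size."""
    return base64_encoded_size(raw_bytes) > max_size


two_mb_image = 2 * 1024 * 1024   # hypothetical 2 MB attachment
small_image = 100 * 1024         # hypothetical 100 KB attachment

print(exceeds_limit(two_mb_image))
print(exceeds_limit(small_image))
```

The point is only that encoding adds roughly a third to the attachment size, 
so an image well under the limit on disk can still push a message over it.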


Does your MTA log the SA processing? Can you see any logged errors associated 
with that particular message?


On Mon, 7 Feb 2022, Chad wrote:


All of the other emails that were sent before and after this particular email 
have the X-Spam-Status and X-spam-Report scoring,

So Spamassassin was running correctly.



-Original Message-
From: Marc 
Date: Monday, February 7, 2022 at 1:49 PM
To: Chad , "users@spamassassin.apache.org" 

Subject: RE: Emails from gmail.com bypassing Spamassassin scoring


I have been getting numerous emails lately from various gmail.com
accounts.  They are spam or phishing emails and today I got one that
had a subject of RECEIPT 5454 and only a JPG image of an invoice.
There was no content in the email.



It bypassed Spamassassin scoring.  Do you know why or what setting I
need to set so EVERY email goes through Spamassassin scoring procedures?




I do not see X-Spam headers[1], so your spamassassin was not working?


[1]
X-Spam-Status: No, score=-0.4 required=3.0 tests=ALL_TRUSTED,SPF_NEUTRAL,
TVD_SPACE_RATIO,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no
version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
4422b522-8a2b-4864-9498-4f2d06aca485




Re: info: dns: bad dns reply: bgread: recv() failed

2022-09-28 Thread David B Funk

On Thu, 29 Sep 2022, Matus UHLAR - fantomas wrote:

[snip..]
/usr/local/share/perl/5.28.1/Mail/SpamAssassin/DnsResolver.pm line 742, 
 line 189.
Wed Sep 28 21:46:55 2022 [9418] info: dns: bad dns reply: bgread: recv() 
failed: Connection refused at 
/usr/local/share/perl/5.28.1/Mail/SpamAssassin/DnsResolver.pm line 742.


That looks like BIND or a packet filter refusing the query packet or 
possibly a case of failed fallback to TCP when a reply was too big for UDP.


Are you certain that BIND is configured to do recursion for 127.0.0.1 and 
doesn't have anything blocking port 53 for both UDP and TCP?




root@nmail:/var/log# cat /etc/resolv.conf
nameserver 127.0.0.1


sure it is BIND running on localhost?

sudo netstat -unlpe


bind9 running
Sep 28 21:45:49 nmail named[12447]: zone 127.in-addr.arpa/IN: loaded 
serial 1
Sep 28 21:45:49 nmail named[12447]: zone 255.in-addr.arpa/IN: loaded 
serial 1
Sep 28 21:45:49 nmail named[12447]: zone domain.nmail/IN: 
sig-re-signing-interval less than 3 * refresh.
Sep 28 21:45:49 nmail named[12447]: zone domain.nmail/IN: loaded serial 1 
(DNSSEC signed)
Sep 28 21:45:49 nmail named[12447]: zone 190.120.37.in-addr.arpa/IN: 
loaded serial 1

Sep 28 21:45:49 nmail named[12447]: zone localhost/IN: loaded serial 2
Sep 28 21:45:49 nmail named[12447]: all zones loaded
Sep 28 21:45:49 nmail named[12447]: running
Sep 28 21:45:49 nmail named[12447]: zone domain.nmail/IN: reconfiguring 
zone keys
Sep 28 21:45:49 nmail named[12447]: zone domain.nmail/IN: next key event: 
28-Sep-2022 22:45:49.345


Does:
  dig @localhost google.com

get you a valid answer or does it give you an error message:

dbfunk@a-lnx000:bin> dig @localhost google.com

; <<>> DiG 9.11.2 <<>> @localhost google.com
; (2 servers found)
;; global options: +cmd
;; connection timed out; no servers could be reached

If you get that kind of error message, it tends to indicate that either your 
bind is not configured to listen on 'localhost' or there's some strange firewall 
issue going on.


Locate your bind's "named.conf" file and look for a "listen-on" parameter.
It should contain the value "any" or explicitly list the various appropriate 
addresses, including the "127.0.0.1" localhost address.






Re: Aw: Re: info: dns: bad dns reply: bgread: recv() failed

2022-09-29 Thread David B Funk

On Thu, 29 Sep 2022, Maurizio Caloro wrote:


First, let me thank you for your quick help; yes, it is running now :-)

mistake:
named.conf.options
  -listen-on { A.B.C.D, localhost; };
  +listen-on { any; };
After this, the error in Spamd.log disappeared, great!


Your mistake is that 'localhost'; you need to have a real IP address there.
Use '127.0.0.1' instead of localhost in that listen-on statement, and also use 
';' for component separators, not ','


IE
  listen-on { A.B.C.D; 127.0.0.1; };

The keyword 'any' means to discover and bind to all possible interfaces on the 
machine.




but now i see in main.log, this message:
Sep 29 21:15:05 nmail postfix/smtp[26109]: warning: DNSSEC validation may be 
unavailable
Sep 29 21:15:05 nmail postfix/smtp[26109]: warning: reason: dnssec_probe 'ns:.' 
received a response that is not DNSSEC validated

I see this as a warning, and I think I don't need to intervene here?


If you want your postfix to be able to validate DNSSEC signed DNS replies you 
need to set up DNSSEC infrastructure. (postfix issue, not spamd).





Re: How do I check for a jpeg attachment?

2022-10-03 Thread David B Funk

On Mon, 3 Oct 2022, Loren Wilton wrote:

I'm getting a bunch of spams from fake gmail accounts that consist of one 
short line of text and a 2 MB jpg file.

The subject and body text are pretty much random beyond that.

How do I check for the following?

--e345f305ea2680cd
Content-Type: image/jpeg; name="MMM.jpg"
Content-Disposition: attachment; filename="MMM.jpg"
Content-Transfer-Encoding: base64
Content-ID: 
X-Attachment-Id: f_l8t6clr50

I want to match on /^Content-Type: image\/jpeg;/ but I can't figure out how 
to do that. rawbody doesn't seem to work.


Use the specific 'mimeheader' rule type:

mimeheader L_IMAGE3e  Content-Type =~ m!image/jpe?g;!i
describe L_IMAGE3e    Has JPG image attachment
score L_IMAGE3e       0.2
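
The idea behind that rule type is that the pattern is tested against the 
headers of each MIME part rather than against body text. Here is a rough 
Python sketch of the same check using the standard email module; the sample 
message, its boundary and its addresses are invented for the illustration, 
and this is not how SpamAssassin itself is implemented.

```python
import re
from email import message_from_string

# Invented two-part message: a text part and a JPEG attachment.
RAW = """\
From: sender@example.com
To: rcpt@example.com
Subject: test
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"

hello

--XYZ
Content-Type: image/jpeg; name="MMM.jpg"
Content-Disposition: attachment; filename="MMM.jpg"
Content-Transfer-Encoding: base64

/9j/4AAQSkZJRg==
--XYZ--
"""

# Same pattern as the rule above: "jpeg" or "jpg", followed by ';'.
JPEG_RE = re.compile(r"image/jpe?g;", re.IGNORECASE)


def has_jpeg_part(raw):
    """Return True if any MIME part's Content-Type header matches JPEG_RE."""
    msg = message_from_string(raw)
    for part in msg.walk():
        if JPEG_RE.search(part.get("Content-Type", "")):
            return True
    return False


print(has_jpeg_part(RAW))
```

Note that, like the rule, the pattern wants a ';' after the type, so a bare 
"Content-Type: image/jpeg" with no parameters would not match.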






Re: Rule Help - not sure what is wrong with my syntax

2023-01-13 Thread David B Funk

On Sat, 14 Jan 2023, Benny Pedersen wrote:


Benny Pedersen skrev den 2023-01-14 03:59:

header TO_SPECIFIC_DOMAIN To:addr =~ /\@(test|junc)\.(com|net|eu)$/
describe TO_SPECIFIC_DOMAIN Mail sent to test.com or test.net email addresses
score TO_SPECIFIC_DOMAIN -0.5

tested works if i mail myself :=)


Benny,

Does it work if you mail To: 
Note that having an '>' character at the end of an address is valid if it has a 
matching '<' but that should fail your "(com|net|eu)$/" test because of the 
anchoring '$'
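
The anchoring effect is easy to reproduce outside SA. A small Python sketch, 
with made-up addresses, using Python's re in place of Perl's regex engine. 
Whether the ':addr' modifier has already stripped the angle brackets before 
the pattern is applied is not something this sketch shows.

```python
import re

# Same pattern as the rule above, written for Python's re module.
pattern = re.compile(r"@(test|junc)\.(com|net|eu)$")

bare = "user@test.com"         # hypothetical address without brackets
bracketed = "<user@test.com>"  # same address in angle brackets

bare_hit = bool(pattern.search(bare))
bracketed_hit = bool(pattern.search(bracketed))

print(f"{bare!r}: {bare_hit}")
print(f"{bracketed!r}: {bracketed_hit}")
```

In the bracketed form the '>' sits between the TLD and the end of the string, 
so the '$' can never line up right after "(com|net|eu)".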





Re: comparing sender domain against recipient domain

2023-05-11 Thread David B Funk



What useful information would you be looking for from this kind of comparison?
I receive mail from people with non-local domains all the time and regularly 
receive e-mail from co-workers using the same domain as me.


The kinds of things that might be useful are:
1) detecting local-domain forgeries (IE if you have DKIM/SPF, etc and the 
message appears to be from your domain but fails those checks)
2) examining the "comment" part of the From: address to see if it contains 
misleading 'domain-like' text.

EG: From: "b...@my.domain.org" 


On Thu, 11 May 2023, Marc wrote:


I was wondering if spamassassin is applying some sort of algorithm to comparing 
sender domain against recipient domain to detect a phishing attempt?






Re: SpamAssassin repeatedly fails to start

2023-07-12 Thread David B Funk

On Wed, 12 Jul 2023, Wingfully Team via users wrote:


Hi,

I’m using SpamAssassin 3.4.0 on a VPS hosted by Hostinger with CentOS 7. 
CyberPanel was installed by Hostinger.

I am constantly (every 90 seconds) seeing spamassassin fail to start, seemingly 
because it can’t find the PID file. I’m sending and receiving emails fine (it 
seems), but this is not only filling up logs/disk space, I’m also worried 
something else is misconfigured which could potentially be causing other 
problems. Here are the logs from /var/log/messages:

Jul 12 23:14:02 wingfully systemd: spamassassin.service start operation timed 
out. Terminating.
Jul 12 23:14:02 wingfully systemd: Unit spamassassin.service entered failed 
state.
Jul 12 23:14:02 wingfully systemd: spamassassin.service failed.
Jul 12 23:14:02 wingfully systemd: spamassassin.service holdoff time over, 
scheduling restart.
Jul 12 23:14:04 wingfully systemd: Can't open PID file /run/spamassassin.pid 
(yet?) after start: No such file or directory
Jul 12 23:15:32 wingfully systemd: spamassassin.service start operation timed 
out. Terminating.
Jul 12 23:15:33 wingfully systemd: Unit spamassassin.service entered failed 
state.
Jul 12 23:15:33 wingfully systemd: spamassassin.service failed.
Jul 12 23:15:33 wingfully systemd: spamassassin.service holdoff time over, 
scheduling restart.
Jul 12 23:15:34 wingfully systemd: Can't open PID file /run/spamassassin.pid 
(yet?) after start: No such file or directory

Here’s the output from systemctl status spamassassin -l

● spamassassin.service - Spamassassin daemon
  Loaded: loaded (/usr/lib/systemd/system/spamassassin.service; enabled; vendor 
preset: disabled)
 Drop-In: /etc/systemd/system/spamassassin.service.d
  └─override.conf
  Active: activating (start) since Wed 2023-07-12 23:29:07 EDT; 1min 5s ago
 Process: 5193 ExecStart=/usr/bin/spamd --pidfile /var/run/spamd.pid 
$SPAMDOPTIONS (code=exited, status=0/SUCCESS)
 Process: 5191 ExecStartPre=/sbin/portrelease spamd (code=exited, 
status=0/SUCCESS)
  CGroup: /system.slice/spamassassin.service
  ├─5198 /usr/bin/spamd --pidfile /var/run/spamd.pid -d -c -m5 -
  ├─5199 spamd chil
  └─5200 spamd chil

Jul 12 23:29:07 wingfully.host systemd[1]: Stopped Spamassassin daemon.
Jul 12 23:29:07 wingfully.host systemd[1]: Starting Spamassassin daemon...
Jul 12 23:29:07 wingfully.host spamd[5193]: logger: removing stderr method
Jul 12 23:29:09 wingfully.host spamd[5198]: spamd: server started on 
IO::Socket::IP [127.0.0.1]:783, IO::Socket::IP [::1]:783 (running version 3.4.0)
Jul 12 23:29:09 wingfully.host spamd[5198]: spamd: server pid: 5198
Jul 12 23:29:09 wingfully.host systemd[1]: Can't open PID file 
/run/spamassassin.pid (yet?) after start: No such file or directory
Jul 12 23:29:09 wingfully.host spamd[5198]: spamd: server successfully spawned 
child process, pid 5199
Jul 12 23:29:09 wingfully.host spamd[5198]: spamd: server successfully spawned 
child process, pid 5200
Jul 12 23:29:09 wingfully.host spamd[5198]: prefork: child states: IS
Jul 12 23:29:09 wingfully.host spamd[5198]: prefork: child states: II

I can’t seem to figure this out. Does anyone knows what’s going on?

Thanks,
Matt


spamd & systemd aren't agreeing on where the PID file is.

look at spamd argument list:
 /usr/bin/spamd --pidfile /var/run/spamd.pid

Note that "/var/run/" part.
Systemd is barking about not finding: "Can't open PID file 
/run/spamassassin.pid"

So either change the spamd arguments or the systemd spamassassin override.conf 
file so they agree on where the silly '.pid' file is going to live.


Note: do NOT change the spamassassin.service file (the next system update will 
overwrite your changes). Put your customizations in the 
/etc/systemd/system/spamassassin.service.d/override.conf file.


Then make sure it actually ends up there.
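
If you want to double-check the two sides mechanically, something along 
these lines would do. It is only a Python sketch: the helper names are mine, 
and treating /var/run as the same place as /run is an assumption (it is a 
symlink on many systemd based distributions, but verify on yours).

```python
import posixpath
import shlex


def pidfile_from_command(command):
    """Pull the --pidfile value out of a spamd command line, if present."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg == "--pidfile" and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith("--pidfile="):
            return arg.split("=", 1)[1]
    return None


def normalize(path):
    """Normalize a path, mapping /var/run/... to /run/... (assumed symlink)."""
    path = posixpath.normpath(path)
    if path == "/var/run" or path.startswith("/var/run/"):
        path = path[len("/var"):]
    return path


def pidfiles_agree(command, systemd_pidfile):
    """True if spamd writes its PID where systemd expects to read it."""
    found = pidfile_from_command(command)
    return found is not None and normalize(found) == normalize(systemd_pidfile)


# Values taken from the status output and log lines quoted above.
spamd_command = "/usr/bin/spamd --pidfile /var/run/spamd.pid -d -c -m5"
expected_by_systemd = "/run/spamassassin.pid"

print(pidfiles_agree(spamd_command, expected_by_systemd))
```

Even with /var/run and /run treated as the same directory, the file names 
still differ (spamd.pid versus spamassassin.pid), which is the mismatch 
systemd is complaining about.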


Re: Sudden surge in spam appearing to come from my email address

2023-07-14 Thread David B Funk



Assuming you own/manage your infrastructure it should be straight-forward.

Create SPF records for your domain & SMTP server, set them to either soft or 
hard fail mode.

If you can, also set up DKIM signing of your outgoing mail.

Then create rules that look for your from address in a message and a meta 
which says "if from me & DKIM-fail/SPF-fail hit it hard"


If you can work with the SPF hard fail you will also help to improve your net 
reputation as spammers will have a harder time trying to "Joe Job" you.



On Fri, 14 Jul 2023, Thomas Cameron wrote:


All -

I am suddenly getting hammered by a BUNCH of spam that appears to be from me. 
It scores low, and even though I keep feeding it to Bayes, it's still not 
hitting the threshold to be marked as spam.


When I check the headers, it's coming from multiple random email servers, but 
many appear to originate from hotmail/outlook.com. So from outlook.com, 
through some unsecured email server, then to my server.


I'm trying to figure out how to block this stuff. Something like "if it 
appears to come from me, but it's not actually coming from my email server," 
block it. I don't necessarily think this is a job for SA, but if there's a 
rule I can tweak or a setting I can change, I'm all ears.


Thanks,
Thomas






Re: Ensuring SPF/DKIM for @gmail.com

2023-07-25 Thread David B Funk



If you do that you will guarantee yourself to get bunches of spam that might 
otherwise be tagged by SA.


The "welcomelist" mechanism says:
 Anybody who matches these criteria we consider strongly not to be spam 
(regardless of how spammy all the other metrics may say it is).


You should "welcomelist" stuff that you want to guarantee passage of, regardless 
of all other considerations.


Given that Google:
a) SPF & DKIMs all the stuff that comes out of their system
b) has lots of spammers who have Gmail accounts and spew spam from them.
c) does not seem to care two hoots about (b) and lets (b) happen even in the
  case of reports.

So if you do those lines (or the more all-encompassing 'welcomelist_auth' form) 
you guarantee those spammers a free ride into your system.


Now if you want to find those critters that forge "n...@gmail.com" as a sender
you'll need to create a custom rule set:
1) a non-scoring rule that fires when from == "@gmail.com"
2) a 'meta rule' that says if-from-gmail && not DKIM_VALID then give 
it a spam score


DKIM_VALID is a standard SA rule that detects a properly valid DKIM or DK 
signature.
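
The logic of that two-step rule set, written out as a small Python sketch 
rather than in SA rule syntax. The function names and the 3.0 penalty are 
placeholders of mine; the only inputs are the From address and whether the 
DKIM_VALID condition from step 2 held.

```python
def from_gmail(from_addr):
    """Step 1: the non-scoring test, true when the sender is @gmail.com."""
    return from_addr.strip().lower().endswith("@gmail.com")


def forged_gmail_suspect(from_addr, dkim_valid):
    """Step 2: the meta condition, from-gmail AND NOT dkim-valid."""
    return from_gmail(from_addr) and not dkim_valid


def adjusted_score(base_score, from_addr, dkim_valid, penalty=3.0):
    """Add a (hypothetical) penalty when the meta condition fires."""
    if forged_gmail_suspect(from_addr, dkim_valid):
        return base_score + penalty
    return base_score


# Addresses are invented examples.
print(adjusted_score(1.0, "someone@gmail.com", dkim_valid=False))
print(adjusted_score(1.0, "someone@gmail.com", dkim_valid=True))
print(adjusted_score(1.0, "someone@example.org", dkim_valid=False))
```

Unlike a welcomelist entry, this never lowers a score: properly signed Gmail 
mail is simply left to the other rules.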



On Tue, 25 Jul 2023, J Doe wrote:


Hi,

I am currently using SpamAssassin 4.0.0 and I had a question on how I can 
ensure that any e-mail from @gmail.com has a valid SPF and DKIM signature.


I am aware that the following can be easily fooled, because it is not 
checking SPF and DKIM:


   welcomelist_from *@gmail.com

... so to ensure valid SPF and DKIM, I believe I would need:

   welcomelist_from_spf  *@gmail.com
   welcomelist_from_dkim *@gmail.com

... or *two* entries.

Is that correct ?

Thanks,

- J






Re: Really hard-to-filter spam

2023-07-27 Thread David B Funk

On Fri, 28 Jul 2023, Jared Hall wrote:


On 7/27/2023 12:08 PM, Ken D'Ambrosio wrote:
Hey, all. I've recently started getting spam that's really hard to deal 
with, and I'm open to suggestions as to how to approach it. Superficially, 

[snip..]
The damn body's been encoded!  And there's so little in there that it's not 
triggering on many rules (e.g., Bayesian doesn't go over 20%).  If anyone 
has a bright idea -- maybe a way to decode the attachments and run a regex 
against _that_? -- I'm all ears.




1.  There are milters/content-filters that decode Base64 message parts 
(amavisd-new, mimedefang, etc) for processing by SA.
2.  There are still sufficiently unique items: First-Name-Only, Mixed-Case 
word in the Subject (NLP modeling), and a Base-64 encoded HTML attachment (w/ 
UTF-8 encoding no less).  Combined in a Meta rule, these innocuous items will 
likely hit with good accuracy even without Base64 decoding.


Umm, unless I'm really missing something here the usual SA processing decodes 
such body stuff (QP, Base64, etc) and feeds the "cleaned" text to the rule 
processing engine.


You have to work hard to get matches done on the raw stuff if you want to do 
special rule matching on the un-decoded body.




Re: Really hard-to-filter spam

2023-08-02 Thread David B Funk

On Wed, 2 Aug 2023, Thomas Cameron via users wrote:

Thank you very much. The message that slipped through today was NOT one of 
the ones being discussed in this thread, it was a different format and 
totally different message. I only included it to demonstrate that my server 
was not being rejected for queries as the blocked user intimated. I will dig 
deeper into the --magic and make sure I'm feeding Bayes with spam and ham.


Regardless, if a message has never been seen before and has little correlation 
to earlier messages its Bayes should hit someplace in the 40% to 60% range.


The fact that it hit 00% indicates a strong correlation to lots of ham (or 
something is screwy with your Bayes).





Re: OT - Re: DNFTEC - was My apologies

2023-08-05 Thread David B Funk

On Sat, 5 Aug 2023, Grant Taylor via users wrote:


On 8/5/23 6:42 PM, Martin Gregorie wrote:

Yes given that he is


Sorry, I was asking for differences between Energy Creatures and Trolls.

I agree with your advice about the particular EC / T.

I'm still trying to understand the conceptual difference between an EC and a 
T or if they are synonyms for the same type of individual.


For the most part they can be pretty much interchangeable but slight shading:

EC -> alignment: neutral/chaotic
T -> alignment: evil

IE an EC can be unpredictable and occasionally positive but at a cost
T is pretty predictably undesirable

Just my U$0.02, YMMV


Re: Scoring Explanation Please

2023-08-30 Thread David B Funk

Denny,

If you read the fine manual for the spamassassin configuration file, in section 
for 'score SYMBOLIC_TEST_NAME n.nn [ n.nn n.nn n.nn ]'


You'll see:

   If only one valid score is listed, then that score is always used for a test.

   If four valid scores are listed, then the score that is used depends on how 
SpamAssassin is being used. The first score is used when both Bayes and network 
tests are disabled (score set 0). The second score is used when Bayes is 
disabled, but network tests are enabled (score set 1). The third score is used 
when Bayes is enabled and network tests are disabled (score set 2). The fourth 
score is used when Bayes is enabled and network tests are enabled (score set 3).


So when there are four score values it will use the one relevant to your SA's 
operating condition.


EG: if the rule is sensitive to the presence of network type tests, such as 
DNSRBLs, the score can be adjusted accordingly.
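
The selection described in the manual excerpt can be sketched in a few lines of Python. This is only an illustration of the documented score-set numbering, not SpamAssassin's own code; the function name `pick_score` is made up, and the four values are the ones from Denny's question.

```python
def pick_score(scores, bayes_enabled, net_enabled):
    """Pick the score for the active score set (0..3) as the manual describes."""
    if len(scores) == 1:
        return scores[0]          # a single score is always used
    if len(scores) != 4:
        raise ValueError("expected one or four scores")
    # set 0: no Bayes, no net; set 1: net only; set 2: Bayes only; set 3: both
    score_set = (2 if bayes_enabled else 0) + (1 if net_enabled else 0)
    return scores[score_set]

scc_canspam_2 = [3.799, 0.001, 3.799, 0.00]   # values from the question

for bayes in (False, True):
    for net in (False, True):
        print(f"bayes={bayes} net={net} ->", pick_score(scc_canspam_2, bayes, net))
```

So for that rule the 3.799 applies only when network tests are off; with network tests on, the rule contributes next to nothing.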



On Wed, 30 Aug 2023, Denny Jones via users wrote:


Hello,

I have looked high and low and can't find an explanation for multi-level 
scoring:

score SCC_CANSPAM_2    3.799    0.001    3.799    0.00

What does this mean?

In my simplistic way of doing things I would write this as:

score SCC_CANSPAM_2 3.799

Thanks for helping clear the mud in my mind!

Denny







Re: Order of handling whitelist/blacklist

2024-03-28 Thread David B Funk

On Thu, 28 Mar 2024, Philip Prindeville via users wrote:





On Mar 28, 2024, at 2:39 AM, Matus UHLAR - fantomas  wrote:

On 27.03.24 20:56, Philip Prindeville via users wrote:

I have something that looks like:

whitelist_from_rcvd v...@yandex.ru vger.kernel.org

blacklist_from *@yandex.ru

And I only ever seem to see the 2nd rule being hit, but not the first.



[snip..]



My config also has:

trusted_networks 192.168.6.0/24
trusted_networks 192.168.8.0/24
trusted_networks 127.0.0.1/32

So I don't think that's the problem.

What are some steps to troubleshoot how the white/black-listing is happening?


whitelist_from_rcvd requires SA to 'see' the envelope from address.
Depending on how you have SA glued into your MTA that may not be happening and 
may require particular configurations.


Try creating an entry for a known good address and see if it fires.

If that source properly DKIM or SPF signs its messages it may be easier to use 
'whitelist_auth' instead of whitelist_from_rcvd.


It's also less maintenance headache as whitelist_from_rcvd must have the proper 
DNS names of their exit-point SMTP servers and in Cloud land that can change 
without notice.



Re: Question about sa-updates

2024-06-21 Thread David B Funk

On Sat, 22 Jun 2024, Paul Schmehl wrote:


  On Jun 22, 2024, at 12:28 AM, Kenneth Porter  
wrote:

On 6/21/2024 8:56 PM, Paul Schmehl wrote:
  I scratched my head, then looked up the man page for sa-update on the 
web. Sure enough, that’s where the rules
  go. Is that where my local.cf file should be located? Right now it’s in 
/etc/mail/spamassassin. There’s a default
  local.cf file in /var/lib/…..


/var/lib/spamassassin is where channels put their rules. /etc/mail/spamassassin 
is where the host admin puts her
customizations. I like to use separate files for different policies, named 
after each effect I'm trying to get. SA will load
anything there with a .cf extension.

It’s not clear to me from your answer. Does SA read rules in both places? Or 
only in /etc/mail/spamassassin/? 



Reading the "man" page documentation for spamassassin, it lists several 
different directories that SA looks for its config files in and the order that 
it reads them from.


The possible directories are distro and version specific so you need to read the 
docs for your specific instance.




Re: How to tell if DnsBlocklists are definitely being used by my Spamassassin setup

2015-11-30 Thread David B Funk

On Mon, 30 Nov 2015, Sebastian Arcus wrote:


On 30/11/15 16:41, Reindl Harald wrote:


Am 30.11.2015 um 17:24 schrieb Sebastian Arcus:

OK - this might be a basic question, but recently the detection rate on
my SA install has been really unreliable, so I decided that the first
step is to be sure it is using the public dns blocklists and razor. My
setup:

1. Spamassassin 3.4.1
2. I have Bind configured as recursive, non-forwarding, caching DNS 
server.

3. spamassassin --lint doesn't return any errors or failures.
5. My init.pre contains "loadplugin Mail::SpamAssassin::Plugin::URIDNSBL"

Here is the report included in one of the emails which is spam, but
wasn't detected as such:

Content analysis details:   (1.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at http://www.dnswl.org/, low
                            trust
                            [212.227.15.41 listed in list.dnswl.org]
 1.0 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
 0.0 HTML_MESSAGE           BODY: HTML included in message
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
 1.3 RDNS_NONE              Delivered to internal network by a host with no rDNS
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines


Does the above mean that the DNSBL tests were applied, but returned zero
values - or would it mean they were skipped. I'm not sure how to find
out which one is it? I'm happy to attach some sample emails which
weren't detected, or any other useful info. Thank you


RCVD_IN_DNSWL_LOW is the opposite of "returned zero values" but why not 
just pass a sample against SA in debug-mode?


spamassassin -D  < /path/to/spam-example.eml

Thank you Harald. I did - and it looks like SA does contact lots of DNSBL's 
and it receives various messages in reply. Nothing that looks like failures 
or errors. I can attach the output here - but it is a lot. Would this mean 
that the DNSBL's are working correctly in my setup - but spammers somehow 
manage to keep on sending from "clean" domains all the time - and I should 
look into some other way of stopping this type of spam? The messages I'm 
talking about are typical spam, with one or two sentences in the email body 
and one or two links - usually advertising life insurance, solar panels and 
similar. None of them are from proper companies or entities I have ever dealt 
with.


I don't see any references to Bayes there, are you running Bayes and is it 
trained?

These "snowshoe" spams are a bit difficult to nail because they keep hopping
around. After a day or two they're listed in various RBLS (both for the IP and
URL hostname) but they rarely sit still long enough for that to help much.
They often have similar characteristics so Bayes can be a big help there.

Are you running RAZOR? It works sort of like a remote Bayes but needs to
be fed and like URIBLS may lag several hours and so not help on an initial flood.

I have my own in-house DNSbls (RBL, URIBL, NSRBL) that I hand feed based upon
spamtrap hits. One thing that helps is a NSRBL (IE a list of NS servers for the
URIs, urifullnsrhssub) that I list the registrars that spammers often get their
domain names from. It has to be used carefully as legit businesses also use
these cheap registrars but when used with METAs for things like BAYES it
helps.
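
For anyone wanting to confirm a single list lookup by hand, such as the "[212.227.15.41 listed in list.dnswl.org]" line in the report quoted earlier, the query name is just the IPv4 octets reversed and prepended to the list's zone. A small Python sketch (the helper name is made up, and it only builds the name, it does no DNS traffic):

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name that an IPv4 DNSBL/DNSWL lookup queries."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

name = dnsbl_query_name("212.227.15.41", "list.dnswl.org")
print(name)
```

Feeding the resulting name to a resolver (for example with dig) and getting back an A record in 127.0.0.0/8 is the usual "listed" answer; NXDOMAIN means not listed. Be aware that many public lists refuse or rate-limit queries that arrive via large shared resolvers.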




Re: how to fix this issue-spam

2016-02-04 Thread David B Funk

On Thu, 4 Feb 2016, Reindl Harald wrote:


DMARC is a combination of SPF and DKIM plus From: header spoofing check.
You must get SPF and DKIM setup before adding the '_dmarc' DNS record for
the sending domain


tell me something new

wait i tell you something (for you) new: DMARC and mailing-lists is a awful 
topic - what do you think would have happened with you mail to the list if 
your domain would enforce DMARC and my MX reject mails violating the policy?


It's true that forwarding/maillists mangle SPF; however, unless the list does the 
idiocy of subject munging (pre-pending '[list-name]' to the subject) DKIM should
pass thru lists unscathed.

So if your DMARC policy is to accept on SPF -or- DKIM success then you should
have no worries. If you demand SPF -and- DKIM then good luck to you.
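
The -or- versus -and- distinction can be written out as a tiny truth table. This is a simplified Python sketch: the two booleans stand for "SPF passed and is aligned with the From: domain" and "DKIM passed and is aligned", and the function names are made up. As far as I know the DMARC spec itself (RFC 7489) uses the either/or form; demanding both is a stricter local choice.

```python
def dmarc_pass_either(spf_aligned_pass, dkim_aligned_pass):
    """Either aligned mechanism passing is enough."""
    return spf_aligned_pass or dkim_aligned_pass

def local_policy_both(spf_aligned_pass, dkim_aligned_pass):
    """Hypothetical stricter local policy that demands both."""
    return spf_aligned_pass and dkim_aligned_pass

# Typical list-forwarded mail: SPF breaks (new sending IP), DKIM survives
# as long as the list does not modify signed headers or the body.
via_list = {"spf_aligned_pass": False, "dkim_aligned_pass": True}

print("either:", dmarc_pass_either(**via_list))
print("both:  ", local_policy_both(**via_list))
```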





Re: VERY_LONG_REPTO_SHORT_MSG

2016-02-26 Thread David B Funk

On Fri, 26 Feb 2016, Bowie Bailey wrote:


On 2/26/2016 12:46 PM, Antony Stone wrote:

On Friday 26 February 2016 at 18:14:53, Axb wrote:


On 02/26/2016 06:04 PM, John Hardin wrote:

On Fri, 26 Feb 2016, Reindl Harald wrote:

score VERY_LONG_REPTO_SHORT_MSG 3.999 3.999 3.999 3.999
header __VERY_LONG_REPTO Reply-To =~ /[^\s\@]{20,}\@/

Reply-To: malgorzata.warmin...@oranet.pl

very long?
20 chars?
4 points?
seriously?

that needs to be lower scored or 20 raised to much higher values

OK, set to 25 and limit 3.5

This rule is definitely bad.
A lot of euro languages have domains with a ton of chars.
imo, a lame excuse of a rule.

my LOUD -1 for this kind of exercise.

And another from me (40 chars in my address, for example).


antony.st...@spamassassin.open.source.it


Take another look at that regex.  It's not matching domains.  The match has 
to be followed by an @, so it is matching the user part of the address.


FWIW, the VERY_LONG_REPTO_SHORT_MSG rule has not hit anything at all on my 
server in the last month.


We had to tune that rule down quite a while ago. When you have an institutional
system which generates e-mail addresses based upon transliterated first-lastname
and have an international user community (including Latinos, people from 
the middle-east or asian-Indians) you end up with addresses such as:


chethyaupalakxyz-ranasin...@uiowa.edu
hernan-nabucolevaxyzreirafrei...@uiowa.edu
ammarsahibabdulameer-xyzhaf...@uiowa.edu

So we see regular FPs on that rule (say 5~10 per month)
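
To see what that sub-rule's regex does and does not match, here is a quick Python check. The pattern is the one quoted above, transcribed to Python syntax; the two addresses are made up for the example (the real ones in this thread are obfuscated by the archive).

```python
import re

# Same pattern as the __VERY_LONG_REPTO sub-rule: 20 or more characters that
# are neither whitespace nor '@', immediately followed by '@'.
VERY_LONG_REPTO = re.compile(r"[^\s@]{20,}@")

# Hypothetical Reply-To values.
long_domain = "short.user@an-extremely-long-domain-name.example.org"
long_local = "firstname-middlename-lastname@example.edu"

for addr in (long_domain, long_local):
    print(addr, "->", bool(VERY_LONG_REPTO.search(addr)))
```

A long domain alone never matches, since nothing after the '@' is followed by another '@'. A long transliterated first-lastname local part matches easily, which is where our FPs come from.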



Re: PDF files containing executables?

2016-03-03 Thread David B Funk

On Thu, 3 Mar 2016, Marc Perkel wrote:

A customer of mine inquired about executable viruses inside of PDF files. Is 
that so? And if it is - is there any way of detecting executables inside of 
PDF?


I don't know that PDFs can contain classical ".exe" type executables but they
can clearly contain 'active content' (javascript, flash, etc) which can be
abused as a malware delivery vehicle.
So for practical purposes PDFs can be considered potential virus containers.

AV scanners have rules for detecting malware inside PDFs but that's always a 
catch-up game.




Re: PDF files containing executables?

2016-03-03 Thread David B Funk

On Thu, 3 Mar 2016, Dianne Skoll wrote:


On Thu, 3 Mar 2016 13:27:18 -0800 (PST)
John Hardin  wrote:

[Dianne Skoll]


However, many legitimate PDF files contain Javascript snippets.
Blocking solely on that basis will lead to many FPs.



I'd argue the "legitimate" part of that statement... :)


Well, maybe, but I think you'd lose that argument if you had to provide
service to the clients we do.


Sounds to me like it should be: block any PDF with
javascript/flash/java with whitelisted bypass.


If we did that, we'd have hundreds of support tickets pouring in... trust
me on this.  At least wrt Javascript.  Not sure about Flash and I had no
idea Java could be embedded in PDF... are you sure that's even possible?


I didn't think that a pure ".exe" could be embedded in PDF until I ran across
this little gem: http://blog.didierstevens.com/2010/03/29/escape-from-pdf/
(not sure if that vulnerability is still there, but people hang onto old systems
for a looong time...)




Re: PDF files containing executables?

2016-03-03 Thread David B Funk

On Thu, 3 Mar 2016, John Hardin wrote:


On Thu, 3 Mar 2016, Dianne Skoll wrote:


On Thu, 3 Mar 2016 13:03:44 -0800
Marc Perkel  wrote:


Thanks for the response. I'm in the spam filtering business and I'm
wondering what I can use (from the command line?) to detect if a PDF
has any kind of script attached that would be executable. that way I
might block based on what's embedded in a PDF.


There are tools.  Google is your friend.

However, many legitimate PDF files contain Javascript snippets.  Blocking
solely on that basis will lead to many FPs.


I'd argue the "legitimate" part of that statement... :)


Many editable PDF forms use javascript for input validation, like most of the 
PDF forms you can download from irs.gov. (I'm not going to get in an argument 
with you about how "legitimate" the IRS is ;)


Sounds to me like it should be: block any PDF with javascript/flash/java with 
whitelisted bypass.


What sane MTA accepts bare executable attachments from the Internet at large 
any more? The same policy should apply to PDFs.


Don't tell me you've never seen HTML e-mail with embedded javascript?
Some content creators think that e-mail should be a full-fledged HTML page.

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: DOS_OUTLOOK_TO_MX and fp

2016-03-04 Thread David B Funk

On Fri, 4 Mar 2016, Alex wrote:


Hi,

I have a legitimate mail that received 2.8 points, making it spam, as
a result of what appears to be a false positive with DOS_OUTLOOK_TO_MX

http://pastebin.com/dbm2Q4k6

There doesn't seem to be any desktop system involved, just direct
communication with the sender's service provider. Is this the cause?

Is it possible this rule has a problem, or perhaps just the score is too high?

Thanks,
Alex


Usual mail flow is:
  sender-PC -> sender-ISP-MSA -> (zero or more intermediate MTAs -> )
recipient-MX-MTA -> (zero or more internal recipient-MTAs -> ) delivery system

Assuming the message started at a sender's PC you should see 2 or more external 
network transmission operations before it gets in the recipient-MX-MTA system. 
(IE two or more systems listed in the "X-Spam-RelaysUntrusted" header.)


If the message originated in a server (EG a MSP) rather than a PC, you might 
only see one external network transmission (but a bit uncommon for legit mail).


Looking at your pastebin example, I see only one system in the 
"X-Spam-RelaysUntrusted" header and ALSO a "X-Mailer: Microsoft Office Outlook 
12.0" header. (which implies that the message originated on a PC, who runs 
Outlook on a server?).


So the inference is that a PC handed the message directly to the recipient's 
MX-MTA, thus the firing of that rule.
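
The inference can be sketched as a small Python heuristic. To be clear, this is NOT the actual definition of DOS_OUTLOOK_TO_MX, just the reasoning above written out; the function names and the sample header values are made up, and the bracketed "[ ip=... ]" layout is assumed from what the X-Spam-Relays* pseudo-headers usually look like.

```python
import re

def count_relays(relays_header):
    """Count the '[ ip=... ]' entries in an X-Spam-Relays* header value."""
    return len(re.findall(r"\[\s*ip=", relays_header))

def looks_like_direct_to_mx(relays_untrusted, x_mailer):
    """One external hop plus a desktop mail client suggests PC -> MX delivery."""
    one_hop = count_relays(relays_untrusted) == 1
    desktop_client = "outlook" in (x_mailer or "").lower()
    return one_hop and desktop_client

# Hypothetical header values for illustration.
one_hop = ("[ ip=192.0.2.10 rdns= helo=user-pc by=mx.example.org "
           "ident= envfrom= intl=0 id=ABC123 auth= msa=0 ]")
two_hops = one_hop + (" [ ip=198.51.100.7 rdns=smtp.example.net "
                      "helo=smtp.example.net by=isp.example.net "
                      "ident= envfrom= intl=0 id=DEF456 auth= msa=0 ]")

print(looks_like_direct_to_mx(one_hop, "Microsoft Office Outlook 12.0"))
print(looks_like_direct_to_mx(two_hops, "Microsoft Office Outlook 12.0"))
```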


I'm not trying to defend the score value, just saying that the rule firing seems 
reasonable (IE doesn't look like a FP).


The one way that it could be a FP is if the sender's emitting MTA/MSA system 
deliberately stripped off all transport information to hide the original source.
That seems like a rare case. If it were common (I'm not aware of having seen it 
before) the masscheck scoring process should have driven down the score of that 
rule.




Re: Missed spam, suggestions?

2016-03-07 Thread David B Funk

On Mon, 7 Mar 2016, Charles Sprickman wrote:


I’ve been running with some daily training for a little over a week and I’m 
seeing less spam in my inbox.  I’ve seen a few things slip through because 
bayes tipped them below the default score, these were two phishing emails.

Here’s some rule stats for anyone interested:

TOP SPAM RULES FIRED

RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   1  TXREP                 13171      8.47    40.38    91.00   72.91
   2  HTML_MESSAGE          12714      8.18    38.98    87.85   90.80
   3  DCC_CHECK             10593      6.81    32.48    73.19   33.78
   4  RDNS_NONE             10269      6.60    31.48    70.95    5.63
   5  SPF_HELO_PASS         10070      6.48    30.87    69.58   23.41
   6  URIBL_BLACK            9711      6.25    29.77    67.10    1.58
   7  BODY_NEWDOMAIN_FMBLA   9550      6.14    29.28    65.98    1.64
   8  FROM_NEWDOMAIN_FMBLA   9483      6.10    29.07    65.52    1.36
   9  BAYES_99               8486      5.46    26.02    58.63    1.18
  10  BAYES_999              8141      5.24    24.96    56.25    1.06

TOP HAM RULES FIRED

RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   1  HTML_MESSAGE          16473      9.13    50.51    87.85   90.80
   2  DKIM_SIGNED           13776      7.64    42.24    13.81   75.93
   3  TXREP                 13228      7.33    40.56    91.00   72.91
   4  DKIM_VALID            12962      7.19    39.74    11.93   71.44
   5  RCVD_IN_DNSWL_NONE     9941      5.51    30.48     8.08   54.79
   6  DKIM_VALID_AU          8711      4.83    26.71     7.99   48.01
   7  BAYES_00               8390      4.65    25.72     1.84   46.24
   8  RCVD_IN_JMF_W          7369      4.09    22.59     2.54   40.62
   9  RCVD_IN_MSPIKE_WL      6713      3.72    20.58     4.39   37.00
  10  BAYES_50               6201      3.44    19.01    25.56   34.18



Based upon your stats it looks like you need more Bayes training. 
Your Bayes 00/99 hits should rank higher in the rules-fired stats and BAYES_50 
shouldn't be in the top-10 at all.

(of course if you've only been training for a week that would explain it).

For example, here's my top-10 hits (for a one month interval).

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK  RULE NAME            COUNT  %OFMAIL  %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
   1  T__BOTNET_NOTRUST   114907    60.32    86.81   42.66  0.5755
   2  BAYES_99            109138    32.98    82.45    0.01  0.9998
   3  BAYES_999           104903    31.70    79.25    0.01  0.
   4  HTML_MESSAGE         90850    79.41    68.63   86.59  0.3456
   5  URIBL_BLACK          90845    27.61    68.63    0.27  0.9942
   6  T_QUARANTINE_1       90640    27.40    68.47    0.02  0.9996
   7  URIBL_DBL_SPAM       79152    24.02    59.79    0.17  0.9956
   8  KAM_VERY_BLACK_DBL   74301    22.45    56.13    0.00  1.
   9  L_FROM_SPAMMER1k     73667    22.26    55.65    0.00  1.
  10  T__RECEIVED_1        72413    42.60    54.70   34.54  0.5135

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK  RULE NAME            COUNT  %OFMAIL  %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
   1  BAYES_00            182674    56.03     2.11   91.97  0.0150
   2  HTML_MESSAGE        171992    79.41    68.63   86.59  0.3456
   3  SPF_PASS            136623    63.08    54.52   68.78  0.3457
   4  T_RP_MATCHES_RCVD   130879    53.75    35.54   65.89  0.2644
   5  T__RECEIVED_2       125492    53.76    39.62   63.18  0.2947
   6  DKIM_SIGNED         114808    38.57     9.72   57.80  0.1008
   7  DKIM_VALID          105385    34.70     7.16   53.06  0.0825
   8  RCVD_IN_DNSWL_NONE   92951    29.90     4.56   46.80  0.0609
   9  T__BOTNET_NOTRUST    84741    60.32    86.81   42.66  0.5755
  10  KHOP_RCVD_TRUST      84623    26.44     2.19   42.60  0.0331

Note how highly BAYES 00/99 ranked. What you don't see is that BAYES_50 is way 
down in the mud (below 50 rank).


BTW, this is with a Bayes that is mostly fed via auto-learning. I occasionally
hand feed corner cases that get mis-classified (usually things like phishes, or 
conference announcements that can look shaky).




Re: Missed spam, suggestions?

2016-03-08 Thread David B Funk

On Tue, 8 Mar 2016, Matus UHLAR - fantomas wrote:

On Mar 8, 2016, at 7:31 AM, Matus UHLAR - fantomas  
wrote:

how can these two stats be different?



On 08.03.16 10:19, @lbutlr wrote:

Because one is for SPAM and one is for HAM.


On Mar 8, 2016, at 10:41 AM, Matus UHLAR - fantomas  
wrote:

Why did you remove the important part?


On 08.03.16 11:16, @lbutlr wrote:

I didn’t.


yes, you did, so I've had to paste them again below:


TOP SPAM RULES FIRED

RANK  RULE NAME      COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   2  HTML_MESSAGE   12714      8.18    38.98    87.85   90.80


TOP HAM RULES FIRED

RANK  RULE NAME      COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   1  HTML_MESSAGE   16473      9.13    50.51    87.85   90.80



Why did the same rule hit 38.98% of all mail and 50.51% of all mail?


Because one is checking SPAM and one is checking HAM.


so why was %OFMAIL different from %OFSPAM in the first case and from %OFHAM
in the second case?


seems that the mail counts were different, but why?


Because there are differing amounts of SPAM and HAM?


if we are only checking spam mail for a given rule, how can be number of
all hits different than number of spam hits? they all should be spam,
shouldn't they?


Assuming that the OP was using Dallas Engelken's "sa-stats.pl" script
(I was) then the report line for each rule (excepting the first column)
should be IDENTICAL.

This script takes as input a spamd's log output. It then aggregates a digest
of all the rule hits. In a given log report there will be lines that are
spam results ("spamd: result: Y 75") and lines that are ham results ("spamd: result: 
. -3").
For each line (spam & ham) there will be a list of the rules that fired on that 
particular message:


2016-03-08T12:37:44.833847-06:00 s-l107 spamd[10463]: spamd: result: . -3 - 
BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,KHOP_RCVD_TRUST,L_LOCAL_MUCHO_DOT_LINES2,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_YE,RP_MATCHES_RCVD,SPF_PASS,T__RECEIVED_1 
scantime=3.5,size=11059,user=redacted,uid=115,required_score=6.0,rhost=s-l012.engr.uiowa.edu,raddr=128.255.17.253,rport=35620,mid=,bayes=0.00,autolearn=ham 
autolearn_force=no


So for the HTML_MESSAGE rule, I get stats of:
grep HTML_MESSAGE sa-stats-dec.out
   4  HTML_MESSAGE    90850   79.41   68.63   86.59  0.3456
   2  HTML_MESSAGE   171992   79.41   68.63   86.59  0.3456

This means that of all the messages processed (for the duration of that log run) 
that rule hit 79.41% of all messages, 68.63% of the messages classified as 
spam (a count of 90850, resulting in a rank of 4) and 86.59% of the messages 
classified as ham (a count of 171992, resulting in a rank of 2).


Thus for a given rule, the %all-messages, %spam and %ham values should be
IDENTICAL in the spam table and the ham table
(assuming they are from the same log run).
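
To make the arithmetic concrete, here is a rough Python sketch of the aggregation. It is NOT the sa-stats.pl source, just an illustration under the assumption that each log line carries "spamd: result: <Y or .> <score> - <rule list>" as in the sample above; the function names and the four log lines are made up.

```python
import re
from collections import Counter

# Verdict (Y = spam, . = ham), integer score, comma-separated rules that fired.
RESULT_RE = re.compile(r"spamd: result: (Y|\.) (-?\d+) - (\S+)")

def tally(log_lines):
    """Count messages per class and rule hits per class."""
    totals = Counter()
    hits = {"spam": Counter(), "ham": Counter()}
    for line in log_lines:
        match = RESULT_RE.search(line)
        if match is None:
            continue
        kind = "spam" if match.group(1) == "Y" else "ham"
        totals[kind] += 1
        hits[kind].update(match.group(3).split(","))
    return totals, hits

def rule_row(rule, totals, hits):
    """Percentages for one rule; the same row belongs in both tables."""
    def pct(part, whole):
        return round(100.0 * part / whole, 2) if whole else 0.0
    spam_hits, ham_hits = hits["spam"][rule], hits["ham"][rule]
    return {
        "of_mail": pct(spam_hits + ham_hits, totals["spam"] + totals["ham"]),
        "of_spam": pct(spam_hits, totals["spam"]),
        "of_ham": pct(ham_hits, totals["ham"]),
    }

# Hypothetical log lines.
log = [
    "spamd: result: Y 75 - BAYES_99,HTML_MESSAGE,URIBL_BLACK scantime=1.0",
    "spamd: result: Y 12 - BAYES_99,URIBL_BLACK scantime=0.9",
    "spamd: result: . -3 - BAYES_00,HTML_MESSAGE,SPF_PASS scantime=3.5",
    "spamd: result: . 0 - BAYES_00,HTML_MESSAGE scantime=2.1",
]
totals, hits = tally(log)
row = rule_row("HTML_MESSAGE", totals, hits)
print(row)
```

Only the COUNT (and hence the RANK) differs between the spam table and the ham table; the percentages come from the same totals either way.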

So for the OP's original post, having %spam %ham be identical but %all-messages 
being different is weird. Now it could be that he's got a different version of
the sa-stats script; it has an additional field, that "%of-rules" thing.

So to Charles Sprickman, which sa-stats script did you use to generate your 
rules report?




sa-stats log analyzer (RE: Missed spam, suggestions?)

2016-03-10 Thread David B Funk

That's the output from Dallas Engelken's "sa-stats.pl" log analyzer.
You feed it a segment of your spamd logs and it gives you
those rule hit statistics.

See: http://wiki.apache.org/spamassassin/StatsAndAnalyzers

Looking at that wiki page, I noticed that the copy available is v0.93.
I've got v1.03.
Does anybody know what was the newest one last available on the rulesemporium 
site? Anybody got something newer than v1.03?


I've done a bit of hacking to my copy (such as adding the S/O ratio stats).


On Thu, 10 Mar 2016, Erickarlo Porro wrote:



I would like to know how to get these stats too.

 

From: Robert Chalmers [mailto:rob...@chalmers.com.au]
Sent: Tuesday, March 08, 2016 5:25 AM
To: users@spamassassin.apache.org
Subject: Re: Missed spam, suggestions?

 

Can I ask, how are you getting these stats please?

 

Thanks

  On 8 Mar 2016, at 05:11, David B Funk  
wrote:

 

On Mon, 7 Mar 2016, Charles Sprickman wrote:


  I’ve been running with some daily training for a little over a week and 
I’m seeing less spam in my
  inbox.  I’ve seen a few things slip through because bayes tipped them 
below the default score, these
  were two phishing emails.

  Here’s some rule stats for anyone interested:

  TOP SPAM RULES FIRED

  RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

     1  TXREP                 13171      8.47    40.38    91.00   72.91
     2  HTML_MESSAGE          12714      8.18    38.98    87.85   90.80
     3  DCC_CHECK             10593      6.81    32.48    73.19   33.78
     4  RDNS_NONE             10269      6.60    31.48    70.95    5.63
     5  SPF_HELO_PASS         10070      6.48    30.87    69.58   23.41
     6  URIBL_BLACK            9711      6.25    29.77    67.10    1.58
     7  BODY_NEWDOMAIN_FMBLA   9550      6.14    29.28    65.98    1.64
     8  FROM_NEWDOMAIN_FMBLA   9483      6.10    29.07    65.52    1.36
     9  BAYES_99               8486      5.46    26.02    58.63    1.18
    10  BAYES_999              8141      5.24    24.96    56.25    1.06

  TOP HAM RULES FIRED

  RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

     1  HTML_MESSAGE          16473      9.13    50.51    87.85   90.80
     2  DKIM_SIGNED           13776      7.64    42.24    13.81   75.93
     3  TXREP                 13228      7.33    40.56    91.00   72.91
     4  DKIM_VALID            12962      7.19    39.74    11.93   71.44
     5  RCVD_IN_DNSWL_NONE     9941      5.51    30.48     8.08   54.79
     6  DKIM_VALID_AU          8711      4.83    26.71     7.99   48.01
     7  BAYES_00               8390      4.65    25.72     1.84   46.24
     8  RCVD_IN_JMF_W          7369      4.09    22.59     2.54   40.62
     9  RCVD_IN_MSPIKE_WL      6713      3.72    20.58     4.39   37.00
    10  BAYES_50               6201      3.44    19.01    25.56   34.18


Based upon your stats it looks like you need more Bayes training. Your Bayes 
00/99 hits should rank higher in the
rules-fired stats and BAYES_50 shouldn't be in the top-10 at all.
(of course if you've only been training for a week that would explain it).

For example, here's my top-10 hits (for a one month interval).

TOP SPAM RULES FIRED
--
RANK    RULE NAME   COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
--
  1    T__BOTNET_NOTRUST   114907   60.32   86.81   42.66  0.5755
  2    BAYES_99    109138   32.98   82.45    0.01  0.9998
  3    BAYES_999   104903   31.70   79.25    0.01  0.
  4    HTML_MESSAGE    90850    79.41   68.63   86.59  0.3456
  5    URIBL_BLACK 90845    27.61   68.63    0.27  0.9942
  6    T_QUARANTINE_1  90640    27.40   68.47    0.02  0.9996
  7    URIBL_DBL_SPAM  79152    24.02   59.79    0.17  0.9956
  8    KAM_VERY_BLACK_DBL  74301    22.45   56.13    0.00  1.
  9    L_FROM_SPAMMER1k    73667    22.26   55.65    0.00  1.
 10    T__RECEIVED_1   72413    42.60   54.70   34.54  0.5135

TOP HAM RULES FIRED
--
RANK    RULE NAME   COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
--
  1    BAYES_00    182674   56.03    2.11   91.97  0.0150
  2    HTML_MESSAGE    171992   79.41   68.63   86.59  0.3456
  3    SPF_PASS 

Re: Abused accounts

2016-03-15 Thread David B Funk

On Tue, 15 Mar 2016, Kris Deugau wrote:


Robert Boyl wrote:

Hi, everyone

Please check http://pastebin.com/GUBqpyZ8

Interesting how some spams that abuse some legit account such as this
one are hard to detect, how Spamassassin scores almost nothing although
there are spammy words, etc. System caught DCC_CHECK 1.10.

Some other systems such as isnotspam.com  caught
some SA rule which doesn't exist anymore in latest SA...
AXB_X_FF_SEZ_S=3.10.

[snip..]


I'm not certain, but it also looks like you might not be using Bayes.
This is likely one of the key methods of detecting spam like this;
since it was sent through outlook.com the message structure is perfectly
legitimate so IP DNSBLs will have little value.


It looks like that message was sent from a compromised college account hosted in 
Office-365.


Actually this is one case where Bayes may not be a help. Our campus recently
outsourced almost all users to O365. As a consequence our Bayes gets a
-lot- of ham from O365 and therefore has most of its fingerprints tagged as ham.
Thus it takes a very spammy message passed thru O365 to get anything but
BAYES_00.

IE, out of the 130KB of that message, only a few dozen bytes is actually the 
spam 'payload' and thus Bayes wise gets swamped by the O365 noise.


I'm considering tagging most of the O365 headers with bayes_ignore_header.
Anybody else wrestling with this? Any suggestions?




Re: BODY_URI_ONLY is broken

2016-03-25 Thread David B Funk

On Sat, 26 Mar 2016, Reindl Harald wrote:


BODY_URI_ONLY Message body is only a URI in one line of text

how can that hit the (anonymized) mail below?
___

Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

** =C3=9Cbermittlung:  in ***=
***From:*** **=C3=9Cberpr=C3=BCfen Sie bitte den Artikel unter f=
olgender URL:http://example.com/administra=
tor/index.php?option=3Dcom_k2&view=3Ditem&cid=3D1832">Artikel =C3=BCberpr=
=C3=BCfen

lign=3D"left" class=3D"key">Array  ***


Because that is one long line that has been broken up for shipment using QP 
encoding (those '=' at the end of each part). Before doing body checks SA 
decodes all MIME text components (EG Base64, QP, etc).


So as far as the SA body rules are concerned that -is- only one line.
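
You can see the effect with Python's standard quopri module. This is only a sketch of the QP decoding step, not of SA's body rendering, and the two-line sample is a made-up fragment modeled on the message above.

```python
import quopri

# Two physical lines on the wire: the first ends in '=' (a QP soft line break),
# and '=3D' is the QP escape for a literal '='.
qp_body = (
    b"Please check the article at: http://example.com/administra=\n"
    b"tor/index.php?option=3Dcom_k2&view=3Ditem\n"
)

decoded = quopri.decodestring(qp_body).decode("utf-8")
lines = decoded.splitlines()

print(len(lines))
print(lines[0])
```

The soft breaks vanish on decoding, so the two wire lines come back as one logical line, and that is what the body rules get to look at.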

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: BODY_URI_ONLY is broken

2016-03-25 Thread David B Funk

On Sat, 26 Mar 2016, Reindl Harald wrote:



Am 26.03.2016 um 04:21 schrieb Reindl Harald:

Am 26.03.2016 um 03:54 schrieb David B Funk:

On Sat, 26 Mar 2016, Reindl Harald wrote:


BODY_URI_ONLY Message body is only a URI in one line of text

how can that hit the (anonymized) mail below?
___

Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

** =C3=9Cbermittlung:  in ***=
***From:*** **=C3=9Cberpr=C3=BCfen Sie bitte den Artikel
unter f=
olgender URL:http://example.com/administra=
tor/index.php?option=3Dcom_k2&view=3Ditem&cid=3D1832">Artikel
=C3=BCberpr=
=C3=BCfen

lign=3D"left" class=3D"key">Array  ***


Because that is one long line that has been broken up for shipment using
QP encoding (those '=' at the end of each part). Before doing body
checks SA decodes all MIME text components (EG Base64, QP, etc).

So as far as the SA body rules are concerned that -is- only one line


* it is *not* an URI only
* with that logic *any* message with a link would hit that rule
* the message has a headline and a table

hit that rule is plain wrong


stats of the whole month:

110 hits total
108 clear ham hits (BAYES_00)
1 false positive - the mail above - and flagged because of that
1 spam hit with 17 points, so it did not matter

1.0 points is way too much for a rule which hits practically only ham


At our site that rule has a S/O ratio of 0.9714 (in one month spam=1564, ham=46)
which easily warrants a 1.0 point score. It doesn't hit a lot of messages
(rank score of 245 for spam, 509 for ham) but mostly hits spam.
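
For anyone wanting to check the arithmetic, S/O is just spam hits divided by total hits. A small sketch (the function name is mine, not an SA tool), run on the two sets of numbers in this thread, reading the quoted stats as 1 spam hit and 109 ham hits:

```python
def spam_over_overall(spam_hits, ham_hits):
    """Return the S/O ratio: spam hits divided by all hits of a rule."""
    total = spam_hits + ham_hits
    if total == 0:
        raise ValueError("rule has no hits, S/O is undefined")
    return spam_hits / total

# Counts reported for this site over one month.
site_ratio = spam_over_overall(1564, 46)

# Counts from the quoted message: 110 hits, of which 1 was real spam.
quoted_ratio = spam_over_overall(1, 109)

print(f"{site_ratio:.4f} {quoted_ratio:.4f}")
```

The same rule can look excellent on one mail stream and useless on another, which is why local hit statistics matter when adjusting a score.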



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: HEADS-UP: MIME_NO_TEXT matches Sendmail MIME DSNs

2016-03-29 Thread David B Funk

On Tue, 29 Mar 2016, Bill Cole wrote:


On 29 Mar 2016, at 19:36, John Hardin wrote:

So, a message that's explicitly multipart MIME but which has only one part? 
Or does it actually have multiple parts, just none are marked as 
text/plain?


multipart/report; type=delivery-status. The standard MIME delivery status 
notification structure. 2 very similar RFCs on it. Has 3 (or 4?) parts. All 
except the first part (which IS text/plain; charset=us-ascii but in 
Sendmail's case has no Content-Type header saying so) have proper 
message/[thisandthat] CT headers.



Can you send me some samples?


Probably. Tomorrow. Afternoon. When I can spin up a bullshit VM (what still 
uses sendmail with a default workingish config?) or sanitize examples made 
via real stuff.


OR: if you can submit mail through a Sendmail instance, send mail to any bad 
address anywhere on any machine running any MTA, all it has to do is say '5yz 
blah blah we hate you' to some part of your attempt to send mail.  On any 
machine with a working classical Sendmail-managed mail subsystem you can just 
do 'echo "foo" |mailx -s 'any subject' 
nonexist...@non-local.but.existing.domain' and get one of your very own for a 
bogus address of your choice delivered to /var/spool/mail/yournamehere or 
somewhere like that. Unless your Sendmail is configured to not send MIME 
DSNs. In that case, fire your sysadmin.


I tried your experiment (sent mail to "no-such-user...@hotmail.com" ), got the
DSN, fed it to SA and didn't see any hits on MIME_NO_TEXT. Saw a hit on
T_TVD_MIME_NO_HEADERS but that has no score.

Now my original message was a CT: text/plain.
Maybe if the original message had no textual components at all it might fire as
you describe, but I think it would be an unusual message to have no text, html,
etc at all.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Rule to score word documents

2016-03-30 Thread David B Funk

On Wed, 30 Mar 2016, Alex wrote:


Hi,


I'd like to assign a spamassassin score to received word documents
(doc,docx,xls,xlsx) so they are quarantined on my UTM. I've tried the
following which doesn't work. Can someone show me an example that should
work?

mimeheader DOC_ATTACHED Content-Type =~ /doc/i
describe DOC_ATTACHED email contains a DOC file attachment
score DOC_ATTACHED 12.5


If you're just going to block them outright, you'd probably be better
served doing it in your MTA. Assuming you're using postfix?

/^(Content-(Type|Disposition)\:|[[:space:]]+).*(file)?name="?.*\.doc"?;?$/
REJECT

I believe something like this would work in spamassassin:

mimeheader DOC_ATTACHED Content-Type =~ /="[^"]+\.(?:docx?|rtf)"/i
score DOC_ATTACHED 12.5


This may catch some documents but MS products key almost entirely on the file 
name extension.


So the content type header may be "application/octet-stream" and totally missing 
a 'name=' component but if there's also a Content-Disposition header that has a 
'filename=' component it will trigger file opening behavior.
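
To illustrate why both headers need checking, here is a rough sketch with Python's standard email package (not an SA rule). The sample message and the extension list are made up; get_filename() looks at the Content-Disposition 'filename=' first and falls back to the Content-Type 'name='.

```python
from email import message_from_string

# Hypothetical message: the attachment has a generic Content-Type without
# "name=", and the file name appears only in Content-Disposition.
raw = """\
From: sender@example.com
To: rcpt@example.com
Subject: invoice
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain

see attached

--XYZ
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="invoice.DOC"
Content-Transfer-Encoding: base64

AAAA
--XYZ--
"""

# Illustrative extension list, taken from the extensions named in this thread.
OFFICE_EXTENSIONS = (".doc", ".docx", ".xls", ".xlsx", ".rtf")


def office_attachment_names(message_text):
    """Return attachment names ending in one of OFFICE_EXTENSIONS."""
    msg = message_from_string(message_text)
    names = []
    for part in msg.walk():
        name = part.get_filename()  # Content-Disposition first, then Content-Type
        if name and name.lower().endswith(OFFICE_EXTENSIONS):
            names.append(name)
    return names


found = office_attachment_names(raw)
print(found)
```

A rule that only inspects Content-Type would miss this attachment entirely.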



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Macro virus fun

2016-04-06 Thread David B Funk

On Wed, 6 Apr 2016, Alex wrote:


Hi,

On Wed, Apr 6, 2016 at 3:12 AM,   wrote:

Alex wrote on 2016-04-06 02:40:


http://pastebin.com/FTzbQcHb

The Heuristics.OLE2.ContainsMacros rule is added by amavisd+clamav,
but it's apparently not something that spamassassin can manipulate


change clamd to block this mail, or score this with higher score in
amavisd, but blocking only make sense if you use amavisd-milter so it would
reject if it contains macros, here i just use clamav-milter not amavisd

its not spam, its really malware, handle is so is suggested


This one may be spam/malware, but the vast majority of them are not.
Blocking all files with macros is an obvious solution, but not a good
one.

Is it even possible to use SA to create a rule based on whether it
contains an attachment that has macros? At least then we could create
more aggressive meta rules.


FWIW,

Your example hits on the Sanesecurity custom ClamAV defs (specifically 
Sanesecurity.Badmacro.Doc.objl.UNOFFICIAL).


I have two instances of ClamAV running;

 One with just the stock defs from ClamAV which I use in a front-end milter to 
outright SMTP-reject any detected viri.


 The second has all the algorithmic, PUAs, etc bells-&-whistles activated plus
a full set of 3'rd party "unofficial" defs (Sanesecurity, winnow, bofhland,
etc) that is just used thru the SA Clamav.pm plugin.
That adds a custom 'X-Spam-Clamav' header to the message that contains the name
of the def that fired. I then have SA rules to score against based upon that.

So for example, "Sanesecurity.Badmacro" can be used to trigger a rule
to hit messages which need to be quarantined, etc.

You could create a custom ClamAV def that would look for any kind of macro
inside the various popular documents (.doc, .rtf, .pdf, etc) (ClamAV is good
at knowing how to unpack/scan attachments, so use it as a scanning engine).
You could then craft special handling based upon the detection of said macros.
(delivery time quarantining etc).


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Bayes duplicate message detection algorithm?

2016-05-13 Thread David B Funk
What algorithm does Bayes use to detect that it has already 'seen' a given 
message?


When I receive a bolus (say 40~60) of 'phish' messages from a compromised 
Hotmail/gmail/yahoo account which are mostly the same (body and many headers the
same; only recipients, Message-ID, Date, and a few Received headers are
different), if I feed all of them to Bayes, it will learn only about 10% of them;
the other 90% will be ignored as 'already seen'.

So how does Bayes decide that it has 'already seen' a given message when
it actually hasn't (it has already seen one that is -almost- identical).

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: SA cannot block messages with attached zip

2016-05-20 Thread David B Funk

On Fri, 20 May 2016, Dianne Skoll wrote:


On Fri, 20 May 2016 09:31:48 +0300
Emin Akbulut  wrote:


What do you suggest to fight these spams?


ClamAV is basically useless.

We do it the hard way.  We list the contents of attached archives
(using "lsar") and have filename-extension rules that block .js
inside .zip files.  While this can lead to some FPs, which we handle
with selective whitelisting, it's very effective at catching the
latest crop of cryptolocker-style attacks.
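
The archive-listing idea can be sketched with Python's standard zipfile module instead of "lsar". This is only an illustration under that substitution, not the poster's actual implementation; the sample archive is built in memory and its file names are invented.

```python
import io
import zipfile

# Extension blocked inside archives, as described in the message above.
BLOCKED_EXTENSIONS = (".js",)


def blocked_members(zip_bytes):
    """List archive members whose names end in a blocked extension."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(BLOCKED_EXTENSIONS)
        ]


# Build a throwaway sample archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("invoice_2016.js", "WScript.Echo('x');")
    archive.writestr("readme.txt", "hello")

sample_hits = blocked_members(buffer.getvalue())
print(sample_hits)
```

Only member names are inspected, so this is cheap, but it also explains the FP risk: any legitimate zip that happens to carry a .js file is caught too.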



But isn't this exactly what the "foxhole_all.cdb" 
(http://sanesecurity.com/foxhole-databases/) signatures do?

(or am I missing something?).

I see that they have a "high" risk of FPs but if you are using them as a 
scoring component within SA you should be able to "temper" those results
with other SA rules such as selective use of whitelist_auth.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: PHP eval()'d code

2016-05-26 Thread David B Funk

On Thu, 26 May 2016, John Hardin wrote:


On Thu, 26 May 2016, Reindl Harald wrote:




On 26.05.2016 at 20:50, RW wrote:


 I noticed that Bayes is picking-up on very strong tokens from "eval" and
 "code" in headers like this:


X-PHP-Originating-Script: 1013:global.php(1938) : eval()'d code


 The "eval()'d code" part is in just over 2% of my spam, but it's
 never occurred in a single ham in my corpus.

 The spams seem to be coming from exploited web-servers, and I'm
 wondering if it might be a symptom of the exploit


looks like worth a rule to add points


I've asked for samples and will add a rule based on that.


FWIW,
There's a variant of that in the "KAM.cf" ruleset from March of this year.
(Look for __KAM_BADPHP1, which is meta'ed into KAM_BADPHP)

It doesn't hit a lot of stuff (only 0.08% ) but does have a high S/O (0.9984) in
my mail stream (over the last 2 months).
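
For illustration only, the header pattern under discussion can be matched like this. It is a plain Python regex sketch against the sample header quoted above, not the contents of __KAM_BADPHP1 (which I am not reproducing here).

```python
import re

# Matches the literal "eval()'d code" marker seen in the quoted header.
EVAL_CODE = re.compile(r"eval\(\)'d code", re.IGNORECASE)


def has_evald_php_header(headers):
    """True if X-PHP-Originating-Script carries the eval()'d code marker."""
    value = headers.get("X-PHP-Originating-Script", "")
    return bool(EVAL_CODE.search(value))


suspicious = has_evald_php_header(
    {"X-PHP-Originating-Script": "1013:global.php(1938) : eval()'d code"}
)
ordinary = has_evald_php_header(
    {"X-PHP-Originating-Script": "1013:mailer.php"}
)
print(suspicious, ordinary)
```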


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: Advice: why one relay evaluated and not the other

2016-06-08 Thread David B Funk

On Wed, 8 Jun 2016, jimimaseye wrote:




On 08/06/2016 16:05, Matus UHLAR - fantomas [via SpamAssassin] wrote:
  note that if a server acts as your MX, it should be listed in
  internal_networks, no matter if other company manages it.

  That applies for backup MX servers for your domain, or, even primary MX if
  you fetch mail from it e.g. via pop3.

Mathaus, as this thread has shown, can you explain why adding this range as an 
internal network hasnt made a difference (and it is still
being checked)?


This sounds like a config file confusion issue. IE the SA that you are running 
is looking at different config files than the ones that you are editing,
or some config file that is being read -after- your expected config files
is over-riding your internal_networks settings.

Try running SA with the '--debug' option to see the explicit list of
config files that it is reading. Make sure that it's reading yours and
look at the ones that come after yours to make sure that none of them
have a "clear_internal_networks" directive.
Be sure the user/environment that you test in is the same that is used during
the processing of messages.
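
Once the debug output has given you the list of config files in read order, a quick scan like the following shows where the setting gets wiped. This is a hand-rolled sketch, not part of SA; the file names and contents are hypothetical and it only understands the two directives discussed here.

```python
def effective_internal_networks(config_files):
    """config_files: list of (filename, text) pairs in the order SA reads them.

    Returns the networks still in effect at the end, plus the names of the
    files that contained a clear_internal_networks directive.
    """
    networks = []
    cleared_in = []
    for filename, text in config_files:
        for line in text.splitlines():
            words = line.split("#", 1)[0].split()  # drop comments
            if not words:
                continue
            if words[0] == "clear_internal_networks":
                networks = []
                cleared_in.append(filename)
            elif words[0] == "internal_networks":
                networks.extend(words[1:])
    return networks, cleared_in


# Hypothetical example: a later file silently discards the local setting.
files = [
    ("local.cf", "internal_networks 192.0.2.0/24\n"),
    ("zz_other.cf", "clear_internal_networks\ninternal_networks 10.0.0.0/8\n"),
]
networks, cleared_in = effective_internal_networks(files)
print(networks, cleared_in)
```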

Silly question: is some meta-framework involved in your system (EG amavis, 
etc.)?



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: SA bayes file db permission issue

2016-06-09 Thread David B Funk

On Thu, 9 Jun 2016, Yu Qian wrote:


Yes, I am sure the path is correct, also, if the path is not correct, it will 
show 'db not present'.
I tried to write a small perl script to open the db file, it failed too. so I 
think it maybe the file damaged during the mounting. but I
don't know why this can happen

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 4:24 PM, John Hardin  wrote:
  On Thu, 9 Jun 2016, Yu Qian wrote:

My spam assassin works pretty well if I run it on a single machine, either
mac or linux. that means I update my rules and train my bayes model on the
same machine.

But when I tried to train the model and generate bayes file db on mac, and
I mounted them to a docker container, then sa-learn failed to read the DB.
the permission looks good, because the error just show "failed to open
bayes_toks"

Anyone know the potential problems?



Check the version number of the Berkeley DB libraries on the two different
machines. There are binary-data compatibility issues between some of the
versions. (EG a db file created by v3.0 cannot be opened by v4.2 IIRC).

You may have to do an "sa-learn --backup" on the one system and an
"sa-learn --restore" on the other.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{

Re: why does that mail not get any bayes-classification

2016-06-10 Thread David B Funk

On Sat, 11 Jun 2016, Reindl Harald wrote:




On 10.06.2016 at 23:52, RW wrote:

On Fri, 10 Jun 2016 16:57:45 +0200
Reindl Harald wrote:


see attachemnt, no bayes tag at all looks like a major bug somewhere


In the absence of any debug it's hard to say.


hence i attached the sample


It is possible for no tokens to make it through the selection, in which
case there is no result. That's more likely than normal in your case
since you don't train on headers.


if you would have looked at the message you would have seen that there is 
content and not only headers and it looks like the message has just incorrect 
mime-definitions (missing end headers)


since thunderbird shows the attachment as well as the mail content that would 
be a way for spammers to completly trick out SA


There may be a bug but I don't think it is in the SA distro.

I took your sample and fed it to my SA kit. First time thru it hit BAYES_50, I
then did a "sa-learn --spam < /tmp/ignored_by_bayes_stripped.eml" and retested 
it. It then hit BAYES_999.


So I'd say standard SA + Bayes works on that message. Somebody at your site may
have done some modifications to your SA that is causing you problems.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: local uribl is not called

2016-06-13 Thread David B Funk

On Mon, 13 Jun 2016, Reindl Harald wrote:


* the syntax seems to be correct
* domain listet and dig answers correctly on the sa-machine
* spamassassin -D < sample.eml 2> out.txt
* grep for the uribl don't show any call

uridnsbl   URIBL_LOCAL  uribl.thelounge.net.  A
body   URIBL_LOCAL  eval:check_uridnsbl('URIBL_LOCAL')
describe   URIBL_LOCAL  Contains an URL listed in the URIBL blacklist
score  URIBL_LOCAL  0.1
tflags URIBL_LOCAL  net domains_only


with that two variants errors appear in the maillog while i don't get what's 
wrong with tell the return-code here - anyways, that confirms that the rule 
above seems not to be wrong


Jun 13 00:19:17 mail-gw spamd[5953]: config: SpamAssassin failed to parse 
line, "URIBL_LOCAL uribl.thelounge.net. A 127.0.0.2" is not valid for 
"uridnsbl", skipping: uridnsbl URIBL_LOCAL uribl.thelounge.net. A 127.0.0.2


Jun 13 00:20:03 mail-gw spamd[5953]: config: SpamAssassin failed to parse 
line, "URIBL_LOCAL uribl.thelounge.net." is not valid for "uridnsbl", 
skipping: uridnsbl URIBL_LOCAL uribl.thelounge.net.


Did you "--lint" check the rules before you tried testing them?

That 'SpamAssassin failed to parse line' error sounds like you've got a syntax 
error in there.



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: how to write body rules to match 'tortured html' variations of text phrases?

2016-06-15 Thread David B Funk

On Thu, 16 Jun 2016, RW wrote:


On Wed, 15 Jun 2016 13:40:25 -0700 (PDT)
John Hardin wrote:


On Wed, 15 Jun 2016, jaso...@mail-central.com wrote:



and all the possible line-broken and "="-delimited variations?
There's obviously a lot of them.


That would have to be a rawbody rule


AFAIK QP is decoded even in the rawbody.



That is correct, you need to use 'full' rules (which come before "rawbody") to
get the undecoded (really-raw) message.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: USER_IN_WHITELIST

2016-07-06 Thread David B Funk

On Wed, 6 Jul 2016, Lorenzo Thurman wrote:


I’ve been receiving some spam where spamassassin identifies the sender with 
USER_IN_WHITELIST. These senders (or domains) are
most definitely not in my whitelist. How can I get around this problem? Thanks



SpamAssassin comes with some built-in whitelists (which should be pretty safe to
use). Look in your SA rules kit for things like: 60_whitelist.cf 
60_whitelist_dkim.cf and 60_whitelist_spf.cf

You might also have some 3'rd party rules files that contain whitelists.

You can explicitly negate the effect of an entry from one of these files by
using the appropriate "unwhitelist_from" type configuration statements in your
local.cf config files.

Theoretically you could edit the system config files but those edits could be
lost with the next system rules update, so using the unwhitelist_from technique
is the way to go.

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: SPF should always hit? SOLVED

2016-07-11 Thread David B Funk

On Mon, 11 Jul 2016, Reindl Harald wrote:



On 11.07.2016 at 19:30, RW wrote:

[snip..]

It sounds like SA is not able to parse the envelope sender out of the
headers.

See the description for envelope_sender_header in
man Mail::SpamAssassin::Conf


SA has also a weakness or design mistake here

"envelope_sender_header X-Local-Envelope-From" while that header comes from 
postfix with customized configuration because we use it in own rules has no 
fallback

__

By default, various MTAs will use different headers, such as the following:

   X-Envelope-From
   Envelope-Sender
   X-Sender
   Return-Path
__

well, in case of "envelope_sender_header" present in the configuration and 
that header is missing for whatever reason there is *no fallback* while for 
most cases it would be better to use "envelope_sender_header" as prefered one 
instead the only one


that it is not the case can you see when "add_header all Status _YESNO_, 
score=_SCORE_, tag-level=_REQD_, block-level=8.0, envelope=_SENDERDOMAIN_, 
from=_AUTHORDOMAIN_, _TOKENSUMMARY_" ends with SENDERDOMAIN_ in your headers


The SA Conf man page seems to indicate that it -should- fall back to its 
heuristic if the envelope_sender_header is missing:


   To avoid this heuristic failure, the "envelope_sender_header" setting may be
   helpful.  Name the header that your MTA or MDA adds to messages containing
   the address used at the MAIL FROM step of the SMTP transaction.

   If the header in question contains "<" or ">" characters at the start and
   end of the email address in the right-hand side, as in the SMTP transaction,
   these will be stripped.

   If the header is not found in a message, or if its value does not contain an
   "@" sign, SpamAssassin will issue a warning in the logs and fall back to its
   default heuristics.

It doesn't look like that fall-back is working. If you completely omit the 
envelope_sender_header config setting, the heuristic works.

Maybe you should file a bug-report.

One additional question, if you're setting the envelope_sender_header config
why aren't you actually supplying it?

If you cannot depend upon your system to actually supply the header you list
in your envelope_sender_header config, then don't set that parameter.
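
For clarity, the behaviour the man page describes (configured header preferred, heuristics as fallback) would look roughly like the sketch below. This is my reading of the documentation, written in Python for illustration; it is not SA's code, and SA's real heuristics do more than walk the four headers listed earlier in this thread.

```python
# Header names from the list quoted earlier in this thread.
HEURISTIC_HEADERS = ("X-Envelope-From", "Envelope-Sender", "X-Sender", "Return-Path")


def envelope_sender(headers, configured_header=None):
    """Return (header_name, address) using the configured header if usable,
    otherwise the first heuristic header that holds an address."""
    candidates = []
    if configured_header:
        candidates.append(configured_header)
    candidates.extend(HEURISTIC_HEADERS)
    for name in candidates:
        value = headers.get(name, "").strip()
        if value.startswith("<") and value.endswith(">"):
            value = value[1:-1]  # strip the angle brackets, as documented
        if "@" in value:
            return name, value
    return None, None


# Configured header missing: the documented behaviour is to fall back.
fallback = envelope_sender(
    {"Return-Path": "<bounce@example.org>"}, "X-Local-Envelope-From"
)

# Configured header present: it wins over the heuristic headers.
preferred = envelope_sender(
    {"X-Local-Envelope-From": "a@example.com", "Return-Path": "<b@example.org>"},
    "X-Local-Envelope-From",
)
print(fallback, preferred)
```

If what you observe differs from the first case, that difference is what a bug report should describe.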

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: question about filtering spam

2016-07-19 Thread David B Funk

On Tue, 19 Jul 2016, Jan-Kees van Kampen wrote:


Hi John,


It would be better if you could post a few spamples to something like
pastebin or a webserver you control and send the URLs to the list so
that we can see the complete raw messages.


here are 3 examples:

http://sandberg.nl/sp/

1 and 2 have the pattern I described before,
3 is another one
I didn't see such an obvious pattern,
but I don't know how to tackle that one neither ...

thanks,

Jan-kees


FWIW all three of those messages came from sources that are on multiple
IP-based block-lists (DNSBLs) such as spamhaus.org, spamcop.net, & abuseat.org.

If you were using those methods for filtering (either via postfix filtering or
SA scoring) those messages shouldn't have made it thru your filtering system.
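
In case the mechanics are unfamiliar: an IP-based DNSBL is queried by reversing the octets of the connecting IP and looking that name up under the list's zone. A small sketch that only builds the query names (no DNS traffic). The IP is a documentation address, and the zone names are the ones I believe those lists publish, so verify them against each list's own documentation before use.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Assumed zone names; check each list's usage terms before querying.
ZONES = ("zen.spamhaus.org", "bl.spamcop.net", "cbl.abuseat.org")

queries = [dnsbl_query_name("192.0.2.25", zone) for zone in ZONES]
print(queries)
```

An A-record answer for one of those names means the IP is listed; NXDOMAIN means it is not.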



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: new Mail-SpamAssassin-Plugin-AttachmentPresent

2016-09-06 Thread David B Funk

On Tue, 6 Sep 2016, Alex wrote:


Hi,


Is there any ability to determine if a particular attachment has a
Word macro enclosed in addition to just having a Word document?


that's the job of clamav and the sa-plugin for it

"OLE2BlockMacros yes" in case of a scored SA plugin won't block but add the
score of that clamd-instance, for unconditional block of other things you
typically have a clamd-instance with different config running as
unconditional milter


Yeah, that's unacceptable to me.

I can't accept obscuring whether a particular attachment has a macro
virus and instead just be notified only that it has a macro. That's
effectively saying it's necessary to outright block all macros or risk
allowing attachments with macro viruses to be passed unencumbered.

I was looking for another way to link macros with spamassassin, as the
amavisd/clamd approach is broken.


The reality of the world is:
1) block/quarantine/encumber/tag all documents that have a macro.
2) allow them thru unencumbered and risk delivering documents that might have a 
macro virus.


I assume that you already have an AV that will block/quarantine -known- 
macro viruses.


You say "that's unacceptable to me"
What is 'acceptable' to you? Unless you find some magical prescient anti-virus 
that can accurately predict all possible macro viruses without FPs I don't know 
what else can be done.




--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: new Mail-SpamAssassin-Plugin-AttachmentPresent

2016-09-06 Thread David B Funk

On Tue, 6 Sep 2016, Dianne Skoll wrote:


On Tue, 6 Sep 2016 17:50:25 -0400
Alex  wrote:

[snip]

  Workbook_Open
  Document_Open
  Auto_Open
  AutoOpen



Is there a simple way to identify whether the attachment/macro
contains those listed functions, without the ability to use
mimedefang?


Not that I know of, though you could write a SpamAssassin plugin, I suppose.

Our algorithm simply searches for those strings in an Office documents if
macros were detected.  The newer docx, xlsx, etc. variants are simply
zip files in disguise, so we pipe those through "unzip -p"

While a document could contain macros, and contain one of those strings
just by coincidence, we judged the margin of error to be good enough for
our purposes.

All in all, it's fiddly, tedious, and requires a fair bit of Perl programming.
It's also quite resource-intensive, so make sure you have the CPU horsepower.
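
A rough Python sketch of the string search described above, for the zip-based formats only (docx, xlsx and friends). It mirrors the description, not the poster's actual Perl code; the sample document is a fake built in memory, and real VBA storage may compress its streams so a raw byte search can miss things.

```python
import io
import zipfile

# Auto-run entry point names listed earlier in this thread.
AUTO_OPEN_NAMES = (b"Workbook_Open", b"Document_Open", b"Auto_Open", b"AutoOpen")


def auto_open_markers(document_bytes):
    """Return the auto-run names found in any member of a zip-based document."""
    found = set()
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for member in archive.namelist():
            data = archive.read(member)
            for marker in AUTO_OPEN_NAMES:
                if marker in data:
                    found.add(marker.decode("ascii"))
    return sorted(found)


# Fake "docx": a zip with a made-up macro member, for demonstration only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("word/vbaProject.bin", b"\x00Sub Document_Open()\x00")

markers = auto_open_markers(buffer.getvalue())
print(markers)
```

Reading every member of every attachment is where the CPU cost mentioned above comes from.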


There's already a set of "Sanesecurity" 3'rd party signatures designed to detect
bad stuff in M$ document files (Excel/Word macros, etc) (called 'badmacro.ndb').
I would assume this set of patterns could be incorporated into those sigs (but I
don't have enough experience doing this kind of thing to know for sure.)

It's pretty straight-forward to connect a ClamAV scanning instance to SA using 
the ClamAV plugin. I run two ClamAV instances, one with just the official sigs 
as a MTA blocking milter and the second with all kinds of 3'rd party sigs as a 
spam-scoring engine for SA.



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: AW: X-Spam Tagging - Spam Status YESNO Flags - Sometimes not appended...

2016-09-16 Thread David B Funk

What do you see in your syslog reports from spamc?
Is it reporting any errors?

Please note the 'max-size' parameter for spamc:

  -s max_size, --max-size=max_size
  Set the maximum message size which will be sent to spamd -- any bigger than
  this threshold and the message will be returned unprocessed (default: 500 KB).
  If spamc gets handed a message bigger than this, it won't be passed to spamd.
  The maximum message size is 256 MB.

So any message larger than that parameter (default 500KB) will be silently
bypassed as far as spamd processing is concerned.
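
To estimate how much of your mail falls over that limit, something like the following is enough. It is a back-of-the-envelope sketch, assuming "500 KB" means 500 * 1024 bytes, and the message sizes are invented.

```python
# Assumption: the documented default of "500 KB" read as 500 * 1024 bytes.
DEFAULT_MAX_SIZE = 500 * 1024


def is_scanned(message_size, max_size=DEFAULT_MAX_SIZE):
    """True if a message of this size would be handed to spamd."""
    return message_size <= max_size


def skipped_fraction(message_sizes, max_size=DEFAULT_MAX_SIZE):
    """Fraction of messages that would be returned unprocessed."""
    skipped = sum(1 for size in message_sizes if not is_scanned(size, max_size))
    return skipped / len(message_sizes)


# Hypothetical sizes in bytes, e.g. collected from the mail log.
sizes = [10_000, 20_000, 600_000, 30_000, 40_000]
fraction = skipped_fraction(sizes)
print(fraction)
```

If the untagged share of your mail roughly matches the share of large messages, the size limit is the likely explanation.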

Note, do not make that a large number in an attempt to process -everything- 
unless you have a beefy (lots of RAM & CPU) machine for your spamd processing.




On Fri, 16 Sep 2016, Maik Linnemann wrote:


SA is integrated into postfix via master.cf like:

==
# service type  private unpriv  chroot  wakeup  maxproc command + args
#   (yes)   (yes)   (yes)   (never) (100)
# ==
smtp  inet  n   -   -   -   -   smtpd
  -o content_filter=spamassassin

spamassassin unix -   n   n   -   -   pipe
 user=nobody argv=/usr/bin/spamc -f -e /usr/sbin/sendmail -oi -f 
${sender} ${recipient}

i dont have milter or amavis in place. av scanning is done in the backend, 
where i would say the tagging should already be done (and mostly its the 
case!)


From: li...@rhsoft.net [li...@rhsoft.net]
Sent: Friday, 16 September 2016 15:52
To: users@spamassassin.apache.org
Subject: Re: X-Spam Tagging - Spam Status YESNO Flags - Sometimes not appended...

On 16.09.2016 at 14:49, Maik Linnemann wrote:

So far so good. The concept works like it should with only one
exception: Some mails are not tagged by spamassassin and i dont have a
clue why. Viscerally i would say its about 20% of all mails that arent
tagged by spamassassin


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Custom rule based on AWL score

2016-10-20 Thread David B Funk

On Thu, 20 Oct 2016, John Hardin wrote:


On Thu, 20 Oct 2016, Ian Zimmerman wrote:


On 2016-10-20 08:34, simplerezo wrote:


My understanding is that AWL is helping frequent senders who are known
to not send spam to "reduce" their spam score, preventing false
positive. That's exactly what I want to rely on for my rules: adding
score for mail with "invoice" pretention and an attachment but only
for very unknown users (or spammers).


Just add your custom rules globally, with reasonable scores.

Whitelisted senders get a _huge_ bonus (I think it's 100 points by
default, maybe customizable), so they won't be affected if you do it
right.


ITYM  -100 points. :)

Small but important detail... :)


which is why I like the "def_whitelist*" variety. They have a value of -7.5
(instead of that -100 sledgehammer) which is usually enough to get legit mail 
thru but not enough to swamp out a major rules hit on real spam (which happens 
to get issued by the people you're trying to protect).


EG:
def_whitelist_auth *@nih.gov
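
The difference is easy to see with a little arithmetic. A sketch using the two bonus values mentioned in this thread and the stock required score of 5.0; the rule scores are made-up examples.

```python
REQUIRED_SCORE = 5.0          # SpamAssassin's stock required_score
WHITELIST_BONUS = -100.0      # the "sledgehammer" value mentioned above
DEF_WHITELIST_BONUS = -7.5    # the def_whitelist value mentioned above


def is_spam(rule_score, bonus):
    """Apply a whitelist bonus and compare against the required score."""
    return rule_score + bonus >= REQUIRED_SCORE


# Hypothetical phish (score 20) sent from a compromised whitelisted account.
phish_full_whitelist = is_spam(20.0, WHITELIST_BONUS)      # 20 - 100 = -80
phish_def_whitelist = is_spam(20.0, DEF_WHITELIST_BONUS)   # 20 - 7.5 = 12.5

# Hypothetical legit mail that scores slightly over threshold (6.0).
legit_def_whitelist = is_spam(6.0, DEF_WHITELIST_BONUS)    # 6 - 7.5 = -1.5

print(phish_full_whitelist, phish_def_whitelist, legit_def_whitelist)
```

With the full whitelist the phish sails thru; with the def_ variety it is still tagged while the borderline legit message is rescued.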


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: version.h.pl show stopper

2016-11-18 Thread David B Funk

On Sat, 19 Nov 2016, Dan Jacobson wrote:


$ svn checkout http://svn.apache.org/repos/asf/spamassassin/trunk /tmp/ee
$ cd /tmp/ee
$ echo|perl Makefile.PL PREFIX=/tmp/g
$ make

In the end you will see

cd spamc
/usr/bin/perl version.h.pl
spamc/configure.pl: Can't exec `version.h.pl': No such file or directory at 
spamc/configure.pl line 74.
Makefile:1812: recipe for target 'spamc/Makefile' failed
make: *** [spamc/Makefile] Error 2

No?


Works for me. That `version.h.pl' file has a pound-bang header "#!/usr/bin/perl" 
which will fail if /usr/bin/perl doesn't exist or isn't executable.
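
If you want to check that mechanically, a sketch like this reads the pound-bang line of a script and tests whether the interpreter it names exists and is executable. The helper names are mine, and the demo uses a throwaway file with a deliberately bogus interpreter path.

```python
import os
import tempfile


def shebang_interpreter(script_path):
    """Return the interpreter path named on the script's first line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith(b"#!"):
        return None
    parts = first_line[2:].split()
    return parts[0].decode() if parts else None


def interpreter_usable(script_path):
    """True if the shebang interpreter exists and is executable."""
    interpreter = shebang_interpreter(script_path)
    return bool(
        interpreter
        and os.path.isfile(interpreter)
        and os.access(interpreter, os.X_OK)
    )


# Demo: a temporary script pointing at an interpreter that does not exist.
with tempfile.NamedTemporaryFile("w", suffix=".pl", delete=False) as handle:
    handle.write("#!/nonexistent/bin/perl\nprint 1;\n")
    demo_path = handle.name

demo_interpreter = shebang_interpreter(demo_path)
demo_usable = interpreter_usable(demo_path)
os.unlink(demo_path)
print(demo_interpreter, demo_usable)
```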


What does "file /usr/bin/perl" return on your system?


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Weird Spamassassin startup behaviour on Ubuntu 16.10

2016-12-05 Thread David B Funk

On Tue, 6 Dec 2016, Michael Heuberger wrote:


Anyone?


On 23/11/16 16:11, Michael Heuberger wrote:

Hello folks

New here :)

I'm running Spamassassin v3.4.1 here on an headless Ubuntu 16.10 server 
together with Monit (and Postfix of course). Each time server restarts, 
Monit says first that the spamd process is not running (no PID) but in 
three minutes later it says the opposite, that it is back up running.


There is the /etc/init.d/spamassassin file that boots Spamassassin on 
start. But somehow it does not seem to get executed asap but 3 mins later. 
No idea why.


The PID under /var/run/spamassassin.pid is owned by root and is monitored 
under Monit with these configs:


# (hidden) @ (hidden) in /etc/monit/conf.d [14:00:04] C:1
$ cat spamassassin
check process spamd with pidfile /var/run/spamassassin.pid
   group mail
   start program = "/etc/init.d/spamassassin start" with timeout 180 
seconds

   stop  program = "/etc/init.d/spamassassin stop"
   if 5 restarts within 5 cycles then timeout
   if cpu usage > 79% for 5 cycles then alert
   if mem usage > 79% for 5 cycles then alert
   depends on spamd_bin
   depends on spamd_rc

check file spamd_rc with path /etc/init.d/spamassassin
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor

check file spamd_bin with path /usr/sbin/spamd
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor

Not sure what the problem is here. Could be one of these:
- a bug in /etc/init.d/spamassassin?
- a bad config in the above monit config
- something in postfix or another lib blocking spamassassin

No idea how to investigate. And there is no spamassassin log showing how it 
starts up.


Any clues welcome

Cheers
Michael


Could it be some kind of interaction with the startup of other system services?
(In particular, this feels like a network timeout issue.)

One of the things SA does during its startup process is check to see if 
DNS/network stuff is available.
If the system hasn't yet brought up the network stack when SA starts, it may 
hang waiting for the network to stabilize.


On a running system, if you stop/restart SA do you see the same delay or is 
it only on a cold start of the system?


Is it possible to configure an SA startup dependency on the network being up?



Re: Increase BAYES_99 score?

2017-01-13 Thread David B Funk

On Fri, 13 Jan 2017, Bill Cole wrote:


On 10 Jan 2017, at 10:55, Michael B Allen wrote:


bayes_file_mode 0777


Don't do that. Ever. It is not necessary, despite having been propagated 
widely as a supposed solution for system-wide Bayes permission issues. The 
clear indicator that whoever devised that was flailing in sheer ignorance is 
that it is 0777 instead of 0666: why would ANYONE need execute permission on 
a DB file???


The sane solution is to make sure everything that needs to write to the Bayes 
DB runs as the same user or as users which all have one group in common. The 
absolute loosest mode you should use is 0664, and that only if you do 
something like backups as an unprivileged user. If you can't be bothered to 
think about such security issues at least go with 0666 so it can't be 
subverted as a stealth executable.


And... if you read with comprehension the Spamassassin manual page for that 
attribute you will see:


   bayes_file_mode  (default: 0700)
   The file mode bits used for the Bayesian filtering database files.

   Make sure you specify this using the 'x' mode bits set, as it may also
   be used to create directories.  However, if a file is created, the
   resulting file will not have any execute bits set (the umask is set to
   111). The argument is a string of octal digits, it is converted to a
   numeric value internally.


That "need execute permission" is for the directory, not the DB file.
So -DO- use that 0700 (or, if you must, 0777).
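
The arithmetic in that man page excerpt can be sketched in a few lines of
Python. This is only an illustration of the documented behaviour (mode kept
for directories, execute bits masked off for files by the 111 umask); the
helper name effective_modes is made up and is not part of SpamAssassin.

```python
def effective_modes(bayes_file_mode: str) -> tuple[str, str]:
    """Return (directory_mode, file_mode) as octal strings.

    Assumes, per the man page text quoted above, that the setting is a
    string of octal digits and that a umask of 0111 strips the execute
    bits from any file that gets created.
    """
    mode = int(bayes_file_mode, 8)   # "0700" -> 0o700
    dir_mode = mode                  # directories keep the 'x' bits
    file_mode = mode & ~0o111        # files lose every execute bit
    return format(dir_mode, "04o"), format(file_mode, "04o")


if __name__ == "__main__":
    for setting in ("0700", "0770", "0777"):
        print(setting, effective_modes(setting))
```

So 0700 stays 0700 on a directory and becomes 0600 on the DB files, and even
0777 only yields 0666 files.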





Re: List of trusted senders

2017-01-25 Thread David B Funk

On Wed, 25 Jan 2017, Benny Pedersen wrote:



same as with clamav: 3rd party spam signatures should not really have been in 
clamav, but on a sa channel. i know there is a perl script to turn those 3rd 
party sigs into sa rules, but it uses so much memory that it's not practical :(


my solution to this is to use 2 clamd and 2 clamav-milters: one of the 
milters rejects viruses, while the other just adds headers from 3rd party clamav 
sigs. i hope clamav-milter can get an on_unofficial accept option so i don't 
need 2 clamav-milters and 2 clamd. when i raised this on the clamav bugzilla, 
the maintainers just laughed at me, so i got tired of spending time on that




WRT ClamAV & 3rd-party sigs: what about the spamassassin ClamAVPlugin?

I use that in my SA mix; I still have to use an additional clamd instance to 
load/run the 3rd-party sigs but don't need the extra clamav-milter.
Also, I can create SA rules that score based on the results of the ClamAVPlugin 
run.





Re: New whitelisting trick using from and spf

2017-03-06 Thread David B Funk

On Mon, 6 Mar 2017, Alan Hodgson wrote:


It seems it should be easy to setup “If mail claims to be From: PayPal.com
and is not from PayPal, score +100” but it is not.


This is what DMARC is for.

Run opendmarc as a milter and reject failures. Or score later on DMARC
failure, even if just selectively for highly phished domains.

PayPal publishes p=reject, on paypal.com at least, if not their other domains.



But that won't help you when the scammers set the user visible from as 
"acco...@paypai.com" or some other variant (with the actual address part as 
 or something else.


User agents (such as OutHouse) by default only show the "comment" part of the 
address and hide the actual <> address part, making it easy for scammers to fool 
non-tech-savvy users.
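
A rough Python sketch of what a filter can check that the user agent hides:
split the From header into its displayed part and its real address part and
compare the domains. The addresses are invented (.example domains) and the
function name is hypothetical; this is not how any particular SA rule is
implemented.

```python
from email.utils import parseaddr


def display_name_mismatch(from_header: str) -> bool:
    """True if the displayed name looks like an address whose domain
    differs from the domain of the real <> address part."""
    display, address = parseaddr(from_header)
    if "@" not in display or "@" not in address:
        return False                  # nothing address-like is being shown
    shown_domain = display.rsplit("@", 1)[1].strip().lower()
    real_domain = address.rsplit("@", 1)[1].strip().lower()
    return shown_domain != real_domain


if __name__ == "__main__":
    spoofed = '"accounts@paypai.example" <someone@elsewhere.example>'
    print(display_name_mismatch(spoofed))
```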





Re: FREEMAIL_REPLYTO

2017-03-09 Thread David B Funk

On Fri, 10 Mar 2017, Michael Grant wrote:

[snip..]

The problem is caused by innocentbytan...@ymail.com IN THE BODY!  

This seems a bit overzealous.  It seems like a bit of an over-reach to look at 
headers in the BODY of the message.

This is an excellent rule except for this rude message body cavity search!

I suggest only searching the headers in this rule.

If you really feel it ought to search the body like this, can you please split 
it into 2 rules:
  1) the existing rule which searches the body+headers, and
  2) a second that only searches the headers.


It is not uncommon for spammers to embed a "contact me at my private address" 
line in the body of scam-mails (EG 419, "I found some money and I'd like your 
help...", or "do you need a loan?" stuff).


Just searching for freemail systems in the From or ReplyTo headers by themselves
isn't as powerful as there are lots of Ham mails that have freemail From or 
ReplyTo.


So yes, it is important to find those body addresses and check whether or not
they match the "From:" address (that's its strength).
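
To make the idea concrete, here is a simplified Python sketch: pull addresses
out of the body, and flag the message if one of them is at a freemail domain
and is not the From address. The domain list is a tiny illustrative subset and
the function name is made up; the real FreeMail plugin has its own logic, which
this does not reproduce.

```python
import re
from email.utils import parseaddr

# Illustrative subset only; a real list would be much longer.
FREEMAIL_DOMAINS = {"ymail.com", "gmail.com", "yahoo.com"}

ADDRESS_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)


def body_freemail_differs(from_header: str, body: str) -> bool:
    """True if the body contains a freemail address other than the sender's."""
    _, from_addr = parseaddr(from_header)
    from_addr = from_addr.lower()
    for match in ADDRESS_RE.finditer(body):
        address = match.group(0).lower()
        domain = match.group(1).lower()
        if domain in FREEMAIL_DOMAINS and address != from_addr:
            return True
    return False
```

A legitimate message that merely quotes somebody's freemail address will trip
this too, which is exactly the FP risk mentioned further down.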

If you want to test this, there is a variant rule provided by the FreeMail
plugin which only checks the headers (check_freemail_from & 
check_freemail_header).
Just look for hits on FREEMAIL_FROM, you'll probably find it hits more Ham than 
Spam.


As you found there is the risk of FPs, so don't score this rule as a
one-shot-kill unless you're willing to accept the damage or have other
mechanisms to mitigate the damage.



Office-365 headers swamp Bayes

2017-03-15 Thread David B Funk
I'm having trouble with Bayes getting swamped by all the headers of Office-365 
generated mail messages.


Our campus has outsourced its Exchange mail servers to O-365 and migrated the 
bulk of user accounts to it. Thus a large percentage of mail our departmental 
server receives is from on-campus O-365 users (pretty much all of it ham).


The problem I have is that O-365 messages have a monstrous wad of headers which 
tend to swamp out body tokens so all O-365 messages (from local ham sources and 
remote spammers using O-365) tend to get BAYES_00 score.


I've added as many O-365 specific headers as I can to "bayes_ignore_header" 
statements, but even with that, for a test O-365 message two-thirds of the tokens 
are from headers (40 from headers, 21 from body).


Bottom line: are there config settings that allow adjustment of total tokens, 
tokens from headers vs tokens from body, etc.?


Dave



Re: Can someone post some real-world examples of whitelist_auth, whitelist_spf, and whitelist_dkim?

2017-03-23 Thread David B Funk

On Thu, 23 Mar 2017, fitz wrote:


I am attempting to tighten up my whitelists, replacing whitelist_from with
whitelist_auth, whitelist_spf, and/or whitelist_dkim.  And having trouble.
The simplistic example of
 whitelist_auth b...@example.com  example.net
does not really cut it.

For example, I have the following headers:

Received-SPF: Pass (sender SPF authorized) identity=mailfrom;
client-ip=76.74.244.76; helo=outbound076.dcm8.com;
envelope-from=qd_pat_ba7cce6de305fce6b09be229f71e639fdebb287253d1e...@inbound.dcm8.com;
receiver=some...@bebop.com
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; s=key1;
d=inbound.dcm8.com;

h=Date:From:Reply-To:To:Message-ID:Subject:MIME-Version:Content-Type:Content-Transfer-Encoding:List-Unsubscribe;
bh=glCJ7SPuJhI+sBNWpIcLUzww974=;

b=xtADEde9s1pYTVT8IBwjLVjOiDNCjf8GY3vaqk7HmMMgRtOzRhRcGZkT+yeKNHwlIOk8iYD9Y6uX

mMrOwIYFJ1H5iX1hn5Mj+Pd3BTpdhxPDd0YUBbfvmoa/W7hj2plUYDtSKt5wGYU8GRjSNj7xK5zx
  juMZm6vlWkfFTwRdyM8=
DomainKey-Signature: a=rsa-sha1; c=nofws; q=dns; s=key1;
d=questdiagnosticssurvey.com;

b=mC5TtAPZBG0FwqfSaoAAFEn2hGO193KMoqpRbx/C3CmZ1KTfhcBz+9MsDi5z2dma4tkwLeGXYmMU

IyL3l2Y9bZD5MhpdA3daN8Z2o23QKgHFM7KHxfovtClAniOhoNDukdWhLAumDMlsmg4GG/iutulk
  TbSLKC7h4SYaWu/Y1js=;
Received: from parking.hostmonster.com (10.0.95.23) by outbound076.dcm8.com
(PowerMTA(TM) v3.5r15) id hqfm400lr5gd for ; Thu, 23 Mar
2017 15:39:28 + (envelope-from
)
Date: Thu, 23 Mar 2017 15:39:28 +
From: Quest Diagnostics 
Reply-To: Quest Diagnostics 

I have tried
 whitelist_(spf|auth|dkim) *@QuestDiagnosticsSurvey.com
(questdiagnosticssurvey.com | inbound.dcm8.com | outbound076.dcm8.com |
dcm8.com)
and none seem to work.  I get SPF AUTH and DKIM_VALID_AU but no
USER_IN_WHITELIST.

I have been able to get the whitelist_auth to work for gmail, comcast, and a
few other places, but this one does not seem to work using the same rules.

From WHERE is one supposed to pull the second parameter for these rules?


I think you are confusing whitelist_(spf|auth|dkim) with 
whitelist_from_rcvd.
The former only requires single addresses/address-patterns; the latter requires 
pairs of configuration data.


EG for your example try:
  whitelist_auth sur...@questdiagnosticssurvey.com
  whitelist_spf *@inbound.dcm8.com

One slight potential point of confusion: whitelist_(spf|auth|dkim) allows for 
multiple addresses on one line, so it can look a little like 
whitelist_from_rcvd, which -requires- pairs of conf data, but 
whitelist_(spf|auth|dkim) actually works on single addresses/patterns.



FWIW, I personally like the "def_whitelist_*" form. The def_whitelist_*
variant only gives an additional -15 score (instead of the -100 from the full 
variant). This usually gives the necessary boost to get mis-classified messages 
past filtering without totally swamping nasty spam that sometimes gets emitted 
from ordinarily whitelisted sources (EG when a whitehat business gets 
compromised or one of their staff gets phished).




Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread David B Funk

On Mon, 1 May 2017, Alex wrote:


Hi,

On Mon, May 1, 2017 at 8:44 AM, David Jones  wrote:

From: Alex 


I've taken a more conservative, but also more time-consuming approach
by creating rules that subtract a few points with the right
combination.

I was also hoping there was a more general approach that would make
these rules with such high scores less prone to FPs in the first
place, or at least create a greater burden by default before adding
such high scores to rules involving just a regex.


*  3.3 MSGID_NOFQDN1 Message-ID with no domain name


This one catches even automated reports generated by HP to many of our
users, as well as a common email fax service. They just don't consider
proper RFC compliance in their shell scripts, and to basically turn it
into spam just for that is unreasonable.

Also unfortunately, they don't comply with SPF or DKIM conventions,
and one might argue simply passing SPF_PASS isn't sufficient for a
meta rule before whitelisting.


It's more time-consuming to maintain, but whitelist_from_rcvd lets you 
reasonably safely (safe from spoofing) whitelist a given sender that doesn't 
have DKIM/SPF.


(I'm partial to the "def_whitelist*" version of local whitelists because it will 
save good messages from quarantine but can be over-ridden by heavy-duty spam 
rules, such as malware being sent from a compromised Yahoo user's account.)





Re: Today's Google Docs phish

2017-05-03 Thread David B Funk

On Wed, 3 May 2017, Alex wrote:


Hi,

If you haven't heard, there was a huge Google Docs phishing attack
today. Several hundred bypassed our filters in the hour or so before
we were able to identify them. The To address is always
"h...@mailinator.com" and the subject is always "[user name] has shared
a document on Google Docs with you" where "user name" is some random user.

https://www.theatlantic.com/technology/archive/2017/05/did-someone-just-share-a-random-google-doc-with-you/525279/

I wanted to provide an example in case it helps, even though chances
are the campaign is dead. We've seen Google proxy and redirect attacks
before and will probably see them again.

https://pastebin.com/aWVaMMni


[snip..]

The LOC_FRAUD_DOC is a local rule and the LOC_URI_RARE_TLD was for
'.pro' from John's rules some time ago. They're only scored at 0.6.

Obviously training these would be enough to put them over to spam, but
would someone like to look at the URI in the body to create a possible
rule? It's likely Google is looking at this more closely - do you
think they will put an end to the redirect that's being used?

Should the score for .pro domains and other rare TLDs be higher?

Have you received any of these? Have you done anything to prevent them
next time or from being received this time?


That target domain "g-docs . pro" was registered 12 days ago via namecheap.com,
which was enough to earn it a few extra points at our site.

It's now sitting in a high-scoring local URIBL here (which is enough to get an 
SMTP-REJECT).




Re: US-CERT message FP

2017-05-08 Thread David B Funk

On Mon, 8 May 2017, John Hardin wrote:


On Mon, 8 May 2017, Chris wrote:


I get various posts from US-CERT; none so far have been tagged as spam
until today. The raw message with the SA tags is here -
https://pastebin.com/f71A2FfW What it hit on was:

 pts rule name              description
---- ---------------------- --------------------------------------------------
 5.0 BOTNET                 Relay might be a spambot or virusbot
                            [botnet0.8,ip=208.42.190.173,maildomain=ncas.us-cert.gov,nordns]


That's a bit worrying.

...but that looks like a local rule, I can't find "BOTNET" by itself as a 
rule in SVN. Is it local? How is it defined?



[snip..]


How did ncas.us-cert.gov get classified as a botnet host?



"Botnet" is a SA plugin that was written several years ago by John Rudd which 
tries to look for spammyness clues derived from the DNS/hostname of the 
first untrusted relay. From the source code comments:


# Botnet - perform DNS validations on the first untrusted relay
#looking for signs of a Botnet infected host, such as no reverse
#DNS,  a hostname that would indicate an ISP client or domain
#workstation, or other hosts that aren't intended to be acting as
#a direct mail submitter outside of their own domain.

One of its heuristics is to look for signs of the IP address embedded in the 
hostname (EG looking for things like "client-201.240.187.107.speedy.net.pe") 
as a sign of an infected PC doing direct mail delivery.

This fired on the host name of that site: mailer190173.service.govdelivery.com 
because part of its IP address [208.42.190.173] was found in the name.
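
As a rough illustration of that heuristic (not the Botnet plugin's actual
code, which I have not reproduced here), the following Python sketch reports
whether two or more consecutive octets of the relay's IP show up in its host
name, with or without a separator between them. The function name and the
min_octets parameter are made up for the example.

```python
import re


def ip_in_hostname(hostname: str, ip: str, min_octets: int = 2) -> bool:
    """True if at least min_octets consecutive octets of ip appear in hostname."""
    octets = ip.split(".")
    host = hostname.lower()
    for start in range(0, len(octets) - min_octets + 1):
        for end in range(start + min_octets, len(octets) + 1):
            run = octets[start:end]
            # Octets may be glued together or joined by '.', '-' or '_';
            # the lookarounds stop us matching inside a longer number.
            pattern = (
                r"(?<!\d)"
                + r"[.\-_]?".join(re.escape(octet) for octet in run)
                + r"(?!\d)"
            )
            if re.search(pattern, host):
                return True
    return False


if __name__ == "__main__":
    print(ip_in_hostname("mailer190173.service.govdelivery.com", "208.42.190.173"))
```

Both the govdelivery name above ("190173") and the speedy.net.pe example match,
which shows why a perfectly legitimate bulk mailer can trip this kind of test.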


Years ago I dropped the default Botnet score (5.0) way down because of FPs like 
this.


I'd be concerned with what caused the DKIM signature to fail validation.
(DKIM_SIGNED, T_DKIM_INVALID).
If something in the mail chain is breaking DKIM validation then attempts to use 
things like whitelist_auth are doomed to failure.




Re: US-CERT message FP

2017-05-08 Thread David B Funk

On Mon, 8 May 2017, Chris wrote:


whitelist_auth *@*.us-cert.gov us-cert.gov

This should be:

whitelist_auth *@*.us-cert.gov


I don't know why I keep putting the second entry in my
'my-whitelist.cf' file. I must have read it or something a long, long time
ago in order to be doing this. 


Possibly got the format of whitelist_from_rcvd stuck in your brain. ;)

There is an optional second argument to whitelist_from_dkim which provides the 
domain of a third-party signatory.


EG:
 whitelist_from_dkim j...@example.com
vs:
 whitelist_from_dkim j...@example.net  example.org



block Bayes autolearn for specific messages

2017-05-10 Thread David B Funk
Is there any way to use Bayes autolearn in general but prevent it from learning 
specific messages?


I have a specific source of messages (Office-365) which I would like to prevent 
from being autolearned (without scoring them as spam).


I still want those messages to be SA scored using the normal methods, just not 
be considered -at-all- for autolearning.





Re: block Bayes autolearn for specific messages

2017-05-10 Thread David B Funk

On Wed, 10 May 2017, John Hardin wrote:


On Wed, 10 May 2017, David B Funk wrote:

Is there any way to use Bayes autolearn in general but prevent it from 
learning specific messages?


I have a specific source of messages (Office-365) which I would like to 
prevent from being autolearned (without scoring them as spam).


I still want those messages to be SA scored using the normal methods, just 
not be considered -at-all- for autolearning.


bayes_ignore_from u...@example.com

bayes_ignore_to u...@example.com


John,
Thanks for the suggestion, but I still want the Bayes classifier run on those 
messages, just no autolearning.


bayes_ignore_(to|from) prevents both.

I've already got a rule that adds a small score (0.3) to those messages but 
unfortunately they hit minus-score rules (EG: RCVD_IN_MSPIKE_*, 
KHOP_RCVD_TRUST, etc) often enough that they still get learned.


I could jack up the local score add but then I run the risk of FPing O365 
messages that don't hit the minus-score rules.


Is there some kind of score calculation rule that does something along the lines 
of "if total score is less than N, add M"?


Dave



Re: Negative rule score not working as expected

2017-05-10 Thread David B Funk

On Thu, 11 May 2017, Benny Pedersen wrote:


Anthony Hoppe skrev den 2017-05-11 00:55:

I'm trying to implement a very simple rule that looks at the
"Received" header(s) and if a string is found apply a negative score.
The rule is as follows:

header    AH_KNOWBE4  Received =~ /phishtest\.knowbe4\.com/
score AH_KNOWBE4  score -10.0


above line, remove 2nd score


describe  AH_KNOWBE4  Prevents KnowBe4 campaign emails from falling
into users Junk folders

The rule triggers as expected, but a score of 1 is applied as opposed
to the desired -10.  What am I doing wrong?

Thanks!


Why didn't "spamassassin --lint" bark about this syntax error?



Re: URIBL_BLOCKED on 2 Fedora 25 servers with working dnsmasq, w/ NetworkManager service

2017-05-19 Thread David B Funk

On Fri, 19 May 2017, John Hardin wrote:


On Thu, 18 May 2017, Rob McEwen wrote:

In many cases, they explain to me that their settings got auto-overwritten 
by their hoster - who just HAD to switch their resolv.conf file back to 
8.8.8.8


cron. job.


Wouldn't the SA config parameter "dns_server" over-ride what's in the 
resolv.conf, or doesn't that work for RBL queries?


EG, set:
  dns_server 127.0.0.1

in your local.cf file and don't worry about what's in the resolv.conf




Re: Somewhat OT: DMARC and this list

2017-05-19 Thread David B Funk

On Fri, 19 May 2017, Dianne Skoll wrote:


Hi,

Tons of list traffic keeps getting quarantined because of DMARC.  For
example, a recent message from David Jones :

DMARC policy for domain ena.com suggests Rejection as
DMARC_POLICY_REJECT, but quarantined due to rule settings

$ host -t txt _dmarc.ena.com
_dmarc.ena.com descriptive text "v=DMARC1\; p=reject\; sp=reject\; rua=mailto:dm...@ena.net\;";

(In this instance, we've overridden the DMARC policy and converted it
to quarantine instead of reject, so I was able to retrieve the email, but...)

I'm pretty sure Mailman can do DMARC-munging.  Can ezmlm do the equivalent
of Mailman's "ALLOW_FROM_IS_LIST" feature?

Regards,

Dianne.


My read on this is that "@ena.com" is living dangerously. They publish SPF 
records and DMARC records (with p=reject) but do NOT DKIM sign their mail.


In general it's dangerous to expect SPF to work thru a maillist or other 
forwarder. Often DKIM will but you cannot count on it (particularly if the list 
engages in Subject munging).


If they're only going to use SPF then publishing a DMARC policy of "reject" is 
risky.

See: https://dmarc.org/2017/03/can-i-use-dmarc-if-i-have-only-deployed-spf/
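
For anyone who wants to check a record for this combination, here is a small
Python sketch. It only parses the tag=value pairs of a DMARC TXT record and
flags "p=reject while not DKIM signing"; whether a domain really signs its mail
has to be found out some other way, so that is passed in as a plain flag. The
function names and the rua address are made up.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into a {tag: value} dict."""
    tags: dict[str, str] = {}
    # "host -t txt" shows the separators as "\;", so undo that escaping first.
    for part in record.replace("\\;", ";").split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def risky_reject(record: str, signs_with_dkim: bool) -> bool:
    """True if the policy is p=reject but the domain relies on SPF alone."""
    policy = parse_dmarc(record).get("p", "none").lower()
    return policy == "reject" and not signs_with_dkim


if __name__ == "__main__":
    txt = "v=DMARC1\\; p=reject\\; sp=reject\\; rua=mailto:reports@example.net\\;"
    print(parse_dmarc(txt))
    print(risky_reject(txt, signs_with_dkim=False))
```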

Please let me know if I'm misinterpreting the signs.

Dave



Re: Somewhat OT: DMARC and this list

2017-05-19 Thread David B Funk

On Fri, 19 May 2017, RW wrote:


On Fri, 19 May 2017 14:13:22 -0500 (CDT)
David B Funk wrote:


My read on this is that "@ena.com" is living dangerously. They
publish SPF records and DMARC records (with p=reject) but do NOT DKIM
sign their mail.


Most of them pass DKIM, a minority aren't signed.


Urgg, I see that now. I looked at a few of David Jones' posts to this list and 
saw that they weren't DKIM signed, so I extrapolated that to a general 
assumption.


I see that they're using Office-365. This is one of the issues I have with 
O-365: it's a black box which is hard to second-guess.

Sometimes they DKIM sign, sometimes they don't.
Sometimes they will score incoming messages that are properly DKIM signed as 
spam (for no reason other than the DKIM signature, as far as I can tell).


Bottom line: if you put yourself at the mercy of Office-365, using a DMARC policy 
of "reject" is risky.






Re: Somewhat OT: DMARC and this list

2017-05-19 Thread David B Funk

On Fri, 19 May 2017, David Jones wrote:


From: David B Funk 

 

On Fri, 19 May 2017, RW wrote:



On Fri, 19 May 2017 14:13:22 -0500 (CDT)
David B Funk wrote:


My read on this is that "@ena.com" is living dangerously. They
publish SPF records and DMARC records (with p=reject) but do NOT DKIM
sign their mail.


Most of them pass DKIM, a minority aren't signed.



Urgg, I see that now. I looked at a few of David Jones' posts to this list and
saw that they weren't DKIM signed, so I extrapolated that to a general
assumption.


They are DKIM signed, so something must be stripping the headers.


I see that they're using Office-365. This is one of the issues I have with
O-365: it's a black box which is hard to second-guess.
Sometimes they DKIM sign, sometimes they don't.
Sometimes they will score incoming messages that are properly DKIM signed as
spam (for no reason other than the DKIM signature, as far as I can tell).



Bottom line: if you put yourself at the mercy of Office-365, using a DMARC policy
of "reject" is risky.


I don't.  Our inbound to and outbound from Office 365 is handled by our
own mail servers that are properly DKIM signing.  I have been reviewing
DMARC reports for years now to make sure we had good SPF, DKIM and
DMARC before recently moving to p=reject.

Dave


I hate to break it to you but you are at the mercy of Office-365 and its erratic 
DKIM policy.


The message from you that I'm replying to here (both the one that came directly 
to me and the copy I got thru the Apache list server) is -totally- devoid of 
DKIM headers. (If you'd like to see it I can put it up in paste-bin.)


Looking at some of your other posts to this list, many of them do have DKIM 
headers but not all. The interesting part is that the DKIM headers are 
interpolated with the O-365 headers so it looks like O-365 is taking your 
original message, stripping off the DKIM headers and sometimes re-adding them.


Good luck with this, welcome to the O-365 world.

Dave


Re: ramsonware URI list

2017-07-15 Thread David B Funk

On Sat, 15 Jul 2017, Antony Stone wrote:


On Saturday 15 July 2017 at 11:19:54, mastered wrote:


Hi Nicola,

I'm not good at SHELL script language, but this might be fine:

1 - Save file into lista.txt

2 - transform lista.txt into spamassassin rules:

cat lista.txt | sed s'/http:\/\///' | sed s'/\/.*//' | sed s'/\./\\./g' |
sed s'/^/\//' | sed s'/$/\\b\/i/' | nl | awk '{print "uri;RULE_NR_"$1";"$2"
describe;RULE_NR_"$1";Url;presente;nella;Blacklist;Ramsonware
score;RULE_NR_"$1";5.0" }' > listone.txt ;for i in $(sed -n p listone.txt)
; do echo "$i" ; done | sed s'/;/ /g' > blacklist.cf


If anyone can optimize it, i'm happy.


My first comment would be "useless use of cat" :)

My second comment would be that you can combine sed commands into a single
string, separated by ; so that you only have to call sed itself once at the
start of all that:

sed 's/http:\/\///; s/\/.*//; s/\./\\./g; s/^/\//; s/$/\\b\/i/' lista.txt | nl .


Another observation/optimization; use the perl pattern-match separator character 
specifier to avoid delimiter collision. (EG "m!" ).


The following two regexes are functionally equivalent but one is easier to 
write/read:


  /http:\/\/site\.com\/this\/that\/the\/other\//i

  m!http://site\.com/this/that/the/other/!i

Second one avoids the "Leaning toothpick syndrome" 
https://en.wikipedia.org/wiki/Leaning_toothpick_syndrome
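
For comparison with the sed/awk pipeline, here is a minimal Python sketch that
does roughly the same job (one uri rule per distinct host) but writes the
pattern with "m!" delimiters. The RULE_NR_n names and the 5.0 score are copied
from the pipeline above; the describe text, the function names and the
".example" test hosts are my own placeholders, and the URL list format is
assumed to be one URL per line.

```python
import re
from urllib.parse import urlsplit


def extract_host(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if none is found."""
    url = url.strip()
    if "//" not in url:
        url = "//" + url          # let urlsplit treat a bare host as a netloc
    return urlsplit(url).hostname or ""


def host_rule(number: int, host: str) -> list[str]:
    """Build the uri/describe/score lines for one host."""
    name = f"RULE_NR_{number}"
    pattern = re.escape(host)     # escapes the dots, e.g. site\.example
    return [
        # With "m!" as delimiter the slashes in "://" need no backslashes.
        f"uri      {name} m!^https?://{pattern}\\b!i",
        f"describe {name} URL host present in the ransomware blacklist",
        f"score    {name} 5.0",
    ]


def build_rules(urls: list[str]) -> list[str]:
    """One rule per distinct host, numbered in order of first appearance."""
    hosts: list[str] = []
    for url in urls:
        host = extract_host(url)
        if host and host not in hosts:
            hosts.append(host)
    lines: list[str] = []
    for number, host in enumerate(hosts, start=1):
        lines.extend(host_rule(number, host))
    return lines


if __name__ == "__main__":
    sample = ["http://site.example/this/that/", "http://site.example/other/"]
    print("\n".join(build_rules(sample)))
```

Run the generated file through "spamassassin --lint" before trusting it.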


Another way to use that data is to extract the hostnames and feed them into a 
local URI-dnsbl.
Using "rbldnsd" is an easy to maintain, lightweight (low CPU/RAM overhead) way 
to implement a local DNSbl for multiple purposes (EG an IP-addr based list for 
RBLDNSd or host-name based URI-dnsbl).
The URI-dnsbl has an advantage of being easy to add names (just 'cat' them on to 
the end of the data-file with appropriate suffix) and doesn't require a restart 
of any daemon to take effect.
Clearly it has a greater risk of FPs than a targeted rule that matches on the 
specific URL of the malware. However if the site is purpose created by blackhats 
to disseminate malware or a legitimate site that has been compromised and isn't 
being maintained then there's a high probability that it will be (ab)used again 
for other payloads. In that case blacklisting the host name gets all future 
garbage too.
IMHO: any site on that list with more than 3 entries or a registration age of 
less than a year is fair game for URIdnsbl listing.
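
The "more than 3 entries" half of that rule of thumb is easy to automate. The
Python sketch below counts how often each host appears in the URL list and
returns the ones worth feeding to a local URI-dnsbl. It assumes one URL per
line, each with a scheme; the registration-age half would need a whois lookup
and is not attempted here. The function name and the min_entries parameter are
made up.

```python
from collections import Counter
from urllib.parse import urlsplit


def hosts_to_list(urls: list[str], min_entries: int = 4) -> list[str]:
    """Return the hosts that occur at least min_entries times (i.e. more than 3)."""
    counts: Counter[str] = Counter()
    for url in urls:
        host = urlsplit(url.strip()).hostname   # None if there is no //host part
        if host:
            counts[host] += 1
    return sorted(host for host, n in counts.items() if n >= min_entries)


if __name__ == "__main__":
    sample = [f"http://bad.example/payload{i}.exe" for i in range(5)]
    sample += ["http://once.example/a.zip"]
    print(hosts_to_list(sample))
```

Write the resulting names to the DNSBL data file in whatever format your DNSBL
server expects.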


Looking at that data there are clearly several patterns that could be used to 
create targeted rules.





Re: ramsonware URI list

2017-07-15 Thread David B Funk

On Sat, 15 Jul 2017, Antony Stone wrote:


On Saturday 15 July 2017 at 11:19:54, mastered wrote:


Hi Nicola,

I'm not good at SHELL script language, but this might be fine:

1 - Save file into lista.txt

2 - transform lista.txt into spamassassin rules:

cat lista.txt | sed s'/http:\/\///' | sed s'/\/.*//' | sed s'/\./\\./g' |
sed s'/^/\//' | sed s'/$/\\b\/i/' | nl | awk '{print 
"uri;RULE_NR_"$1";"$2"

describe;RULE_NR_"$1";Url;presente;nella;Blacklist;Ramsonware
score;RULE_NR_"$1";5.0" }' > listone.txt ;for i in $(sed -n p listone.txt)
; do echo "$i" ; done | sed s'/;/ /g' > blacklist.cf

[snip..]

One observation; that list has over 10,000 entries which means that you're going 
to be adding thousands of additional rules to SA on an automated basis.


Some time in the past other people had worked up automated mechanisms to add 
large numbers of rules derived from example spam messages (Hi Chris;) and there 
were performance issues (significant increase in SA load time, memory usage, 
etc).

Be aware, you may run into that situation. Using a URI-dnsbl avoids that risk.

I see that list gets updated frequently. How quickly do stale entries get 
removed from it?
I couldn't find a policy statement about that other than the note about the 30 
days retention for the RW_IPBL list.
Checking a random sample of the URLs on that list, the majority of them hit 
404 errors.
If that list grows without bound and isn't periodically pruned of stale entries 
then it will become problematic for automated rule generation.


I'm not saying that this isn't an idea worth pursuing, just be aware there may 
be issues.




Re: Spam with tons of lines with garbage characters, preceded by

2017-07-19 Thread David B Funk

On Thu, 20 Jul 2017, Andrzej A. Filip wrote:


By default messages bigger than 500KB are not sent to spamd for
processing/scanning => the tactics you describe frequently "turns off"
spam filtering.

IMHO SA should design procedures to deal with big messages.
I personally use a "scan headers only" approach => it seems
to be quite a good first step.


That can be done in the "glue" that connects your mail system to SA.
In my milter I take in the first 'N' bytes (configurable) of the message, pass 
them to SA and then discard the rest (IE truncating the body of the message).
I had to code it to keep track of the MIME headers (if any) and fabricate a mime 
closing tag after the truncation point to maintain the logical integrity of 
the message.
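
A rough Python sketch of that truncation idea follows. It is not the code from my milter; the function name and the choice to handle only the top-level multipart boundary are assumptions made to keep the example short (nested multiparts would need a stack of boundaries).

```python
from email.parser import BytesHeaderParser


def truncate_for_scan(raw: bytes, limit: int) -> bytes:
    """Keep at most `limit` bytes of `raw`; re-close a top-level multipart."""
    if len(raw) <= limit:
        return raw
    # Cut at the last complete line inside the limit, if there is one.
    cut = raw.rfind(b"\n", 0, limit) + 1
    head = raw[:cut] if cut else raw[:limit]
    # Read the boundary parameter from the headers that survived the cut.
    boundary = BytesHeaderParser().parsebytes(head).get_boundary()
    if boundary is None:
        return head  # not multipart, nothing to close
    # Fabricate the closing delimiter so the MIME structure stays valid.
    return head + b"\r\n--" + boundary.encode("ascii") + b"--\r\n"
```

The sketch assumes the limit is comfortably larger than the header block; a cut that lands inside the headers would need separate handling.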


Another way to do it would be to take a mime-aware filter (like mimedefang) and 
use it to strip off non-textual parts of the message to reduce it down in size 
and feed SA the parts that it actually looks at. This won't help if they embed 
insane amounts of garbage text (then only the truncation scheme will help) but 
will help with spam that has lots of images and junk.
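
As an illustration of the stripping approach (using Python's standard email package rather than mimedefang, and with a function name that is purely hypothetical), something along these lines keeps the ordinary headers and the text parts and drops everything else:

```python
from email import message_from_bytes, policy


def text_parts_only(raw: bytes) -> str:
    """Return the non-MIME headers plus only the text/* bodies of a message."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Drop the MIME structure headers; they no longer describe the output.
    headers = [
        f"{name}: {value}"
        for name, value in msg.items()
        if not name.lower().startswith("content-") and name.lower() != "mime-version"
    ]
    # walk() visits every part; images and other attachments are skipped.
    bodies = [
        part.get_content()
        for part in msg.walk()
        if part.get_content_maintype() == "text"
    ]
    return "\n".join(headers) + "\n\n" + "\n".join(bodies)
```

This is only a sketch: it flattens the message instead of rebuilding valid MIME, so it suits a quick size estimate more than a production filter.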





Re: tflags

2017-08-03 Thread David B Funk

On Thu, 3 Aug 2017, Kris Deugau wrote:


Ian Zimmerman wrote:

On 2017-08-03 10:38, sha...@shanew.net wrote:


The most common ones that I make use of are "multiple" and "maxhits"
in order to allow a rule to be scored for each time it hits, but to
stop counting after some threshold.  I also use the "net" tflag so
that RBL checks only run when a net-based ruleset is loaded.


Where is the concept of "ruleset" in general documented, and in
particular what makes it "net-based"?  Not in Mail::SpamAssassin::Conf.



"Ruleset" is a somewhat fuzzy term that depends on context - it could refer 
to a single rule, a cluster of rules in a single file, a group of files, or 
"all active rules files".  It's not a formal definition within SpamAssassin. 
In this case it's referring to one rule - tflags are only set on a per-rule 
basis.


Any net-based rule is one that relies on a working Internet connection to do 
a data lookup - most commonly DNS lookups, but rules for eg Vipul's Razor 
(RAZOR_* rules), DCC, or Pyzor are also considered net rules since they do a 
lookup against a network service somewhere.


More to the point, if you look at the "spamd" documentation for the "-L" flag 
you'll see:


   -L, --local
   Perform only local tests on all mail.  In other words, skip DNS and 
other network tests.  Works the same as the

   "-L" flag to spamassassin(1).

So all "net-based" rules (as indicated by intrinsic coding or the tflags 'net') 
get ignored when running in --local mode.
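
A toy Python model of that behaviour, just to make the filtering explicit. This is not SpamAssassin code; the `Rule` class and `active_rules` function are made up for illustration, and tagging URIBL_BLACK as a net rule here is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    tflags: frozenset = frozenset()


def active_rules(rules, local_only):
    """Mimic --local: drop every rule flagged 'net' when local_only is set."""
    return [r for r in rules if not (local_only and "net" in r.tflags)]


rules = [
    Rule("BAYES_00"),                            # purely local test
    Rule("URIBL_BLACK", frozenset({"net"})),     # needs a DNS lookup
]
local_names = [r.name for r in active_rules(rules, local_only=True)]
```

With `local_only=True` only the local rule remains; with `local_only=False` both are kept.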





Re: Results of Individual Tests on spamd "CHECK"

2017-08-07 Thread David B Funk

On Mon, 7 Aug 2017, Jerry Malcolm wrote:


I'm invoking spamd using:

CHECK SPAMC/1.2\r\n


I'm getting the expected response such as:

Spam: False ; -1.8 / 4.0

I am trying to figure out how to get the TESTS= results of the individual 
tests returned as well.


(e.g.tests=[AWL=-1.103, BAYES_00=-2.599, 
HTML_MESSAGE=0.001,URIBL_BLACK=1.955, URIBL_GREY=0.25])
I see there's an option in spamc that appears to do that.  But I can't figure 
out how to make

that happen when I do a direct socket invoke of spamd.

Can someone tell me what I need to add to the spamd call (and the syntax) in 
order to get the

results of the individual tests returned as part of the status?

Thanks,

Jerry


Jerry,
the spamd 'CHECK' command just returns the status+score, nothing else.

the spamd 'REPORT' command returns the status+score and report.
So replace 'CHECK' with 'REPORT' in your spamd call. Then be ready to read an 
arbitrary number of additional lines in the return connection.


Note that it will not return any part of the original message.
If you want to use any of the SA report features that add additional headers 
(such as the relays header) you will need to use a different spamd command: 
'HEADERS'.


BTW, I cannot tell from your posting if you have one detail correct; you need 
the command (and any additional optional arguments), then a blank line, then the 
message.


EG:

REPORT SPAMC/1.2\r\n
User: joe-blow\r\n
\r\n
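
For a direct socket client, the request layout above could be assembled like this in Python. The helper names are hypothetical, and the `Content-length` header is an assumption taken from the spamd protocol description (it is not shown in the example above); check it against the PROTOCOL file shipped with your spamd.

```python
import re
from typing import Optional


def build_request(command: str, message: bytes, user: Optional[str] = None) -> bytes:
    """Command line, optional headers, a blank line, then the message."""
    lines = [f"{command} SPAMC/1.2", f"Content-length: {len(message)}"]
    if user:
        lines.append(f"User: {user}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + message


def parse_spam_header(line: str):
    """Parse a 'Spam: False ; -1.8 / 4.0' line into (is_spam, score, threshold)."""
    m = re.match(r"Spam:\s*(\w+)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)", line)
    if not m:
        return None
    return m.group(1).lower() in ("true", "yes"), float(m.group(2)), float(m.group(3))
```

Send the bytes from `build_request("REPORT", ...)` over the socket, then read until the server closes the connection, since the report can run to any number of lines.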







Re: Sender needs help with false positive

2017-08-07 Thread David B Funk

On Mon, 7 Aug 2017, Alex wrote:


Hi,

On Mon, Aug 7, 2017 at 6:56 PM, Jacek Osuchowski  wrote:

We use emails to allow users to reset their passwords to our website. We
send very brief emails containing the reset password. Example between :




Your password to access your account is:

S]U3bC7k

Upon successful login you may change your password by going to Modify
Account / Change Your Password.







* 3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%


You can't control their bayes training so there's nothing you can do here.


You -can- control the content of your message. I'm guessing that short
password reset message doesn't have very many tokens, and the ones that it does 
have may be too close a match to things like password phish spams. (something 
that we train heavily on).


Put more text in there that is related to your business/organization which will 
be unique and thus unlike other spammy messages.






* 2.1 HTML_IMAGE_ONLY_12 BODY: HTML: images with 800-1200 bytes of words


Are you sending these emails as an image or text?

Do you have a text component to your message as well?


More to the point do you have an image attached/embedded in your message?
If so, either drop it altogether or add a few Kbytes of text to balance it out.




Re: Sender needs help with false positive

2017-08-07 Thread David B Funk

On Mon, 7 Aug 2017, David Jones wrote:

[snip..]
This IP is listed on SORBS and Spamhaus ZEN which are going to cause problems 
with delivery to many receiving mail filters, not just SpamAssassin.


http://multirbl.valli.org/lookup/68.192.71.191.html



That's his PC which is the MSA. As it's the first hop, it's not surprising it 
hits Zen PBL (it should, given a host name like ool-44c047bf.dyn.optonline.net).


That shouldn't score against him except in broken SA installations.

His problem is the small amount of text that looks like a phish spam and the 
embedded image.






RE: Sender needs help with false positive

2017-08-07 Thread David B Funk

On Mon, 7 Aug 2017, Jacek Osuchowski wrote:


This is an email I sent to IsNotSpam.com. They list the whole thing when 
testing for spam. I am getting a lot of complaints from our customers that our 
emails are not received. Our domain is not blacklisted anywhere so I suspect it 
is the spam filtering (as IsNotSpam tool indicates). Is there anything in the 
email we send that could trigger flagging as spam? THANK YOU

https://pastebin.com/J1cdCHAe



Try this experiment.
Take that same message, add two paragraphs of text describing your 
business/organization to the end and DELETE that embedded image.


Re-test and I'll bet that you get a passing score.




Re: TxRep can't use SQLBasedAddrList factory module

2017-08-15 Thread David B Funk

On Tue, 15 Aug 2017, Christopher Engelhard wrote:


On 08/14/2017 05:24 PM, Kevin A. McGrail wrote:

does mysql -u  -p localhost spamdb work?


Yes, that works. The user has INSERT, DELETE, UPDATE, SELECT privileges.
Does it need CREATE? The table 'txrep' exists with columns username,
email, ip, count, totscore, signedby.

The Bayes-related tables reside in the same DB, and those can be
accessed (though I've only tried it with amavis, not with pure spamd/spamc).

christopher


I've not looked at the TxRep code but some kinds of SQL operations need to be 
able to create temporary tables.


I'd start by giving it all perms (excepting things like GRANT), see if it works, 
and then scale back the perms until you find the minimal necessary set.





Re: In anyone else getting 325KB spams from cont...@cron-job.org?

2017-09-14 Thread David B Funk

On Thu, 14 Sep 2017, Dianne Skoll wrote:


On Thu, 14 Sep 2017 11:27:27 -0700
"Loren Wilton"  wrote:


Other than being obvious spam, they seem to be set up as though they
were legitimate commercial mailing list stuff, often containing
things like contact-id and the like in the links.



Is anyone else seeing these?


A small number.  The cont...@cron-job.org address is only in the From:
header; the envelope recipients look randomly-generated and sometimes
from unrelated domains.

Should be easy to block.  Just block the cron-job.org domain.


Not to mention that the target URL "proffbuilder DOT com" is listed in several 
URIBLs.





Re: ISIPP - Re: bb.barracudacentral.org

2017-09-19 Thread David B Funk

On Tue, 19 Sep 2017, Chris wrote:


On Wed, 2017-09-20 at 00:40 +0100, Martin Gregorie wrote:

On Tue, 2017-09-19 at 16:44 -0500, Chris wrote:



Thanks Martin, here's what I get, it appears to not be running.

sudo systemctl stop dnsmasq
[sudo] password for chris: 
Failed to stop dnsmasq.service: Unit dnsmasq.service not loaded.


OK, that makes sense
 


sudo systemctl disable dnsmasq
Failed to execute operation: No such file or directory


That's interesting: I've never seen that before:



[snip..]


It would be interesting to know what 'systemctl status' shows on your
system, though it's quite possible it looks similar to what 'systemctl
disable' showed. I can only guess that your system is a transitional
systemd setup, i.e. systemctl is used for service management but some
services (dnsmasq for one) are still running under the old systemV init
scripts. Fedora installations used to work that way for some services,
but that was a few versions ago (F21 or 22 at the latest).


Martin
 

Hi Martin, here's what I see:

sudo systemctl status dnsmasq
[sudo] password for chris: 
● dnsmasq.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
chris@localhost:~$ sudo systemctl enable dnsmasq
Failed to execute operation: No such file or directory
chris@localhost:~$ sudo systemctl status dnsmasq
● dnsmasq.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)

I then installed dnsmasq (apparently it wasn't installed)

Results are here - https://pastebin.com/MRR4NCMp


dnsmasq was already there (see your own previous posts) just not put there via 
the "apt" package management system. Thus "apt" didn't know about the rogue 
dnsmasq process, and it failed to start the newly installed one.
(as the rogue dnsmasq process was already there, running, and bound to the DNS 
socket).


So now you have -two- dnsmasq kits, one installed by "apt" and managed thru the 
"systemctl" tools, and another one that somebody put there which is outside the 
realm of "apt" & "systemctl" (thus they don't know how to manage it).


You should really pick one method of installing/managing software and stick with 
it.


This is similar to the mess you get when you mix CPAN with yum/yast/rpm/apt for 
installing Perl modules.




Re: OT - Hotmail/Outlook.com marking most of our email as Junk

2017-09-20 Thread David B Funk

On Wed, 20 Sep 2017, Rupert Gallagher wrote:


> 10. The emails we send are operational and notices emails to customers - 
who need them. They call on the phone and complain they haven't received 
them - just to discover they were sent, but ended up in the junk. 

Tell them to send you a copy of the header, then look for clues in their 
anti-spam report. 


Good luck with that.
Have you ever seen the kind of stuff that M$ adds to 
Hotmail/Outlook.com/Office365 etc.. messages?


Then when you try to track down any info on how to interpret the dense pile of 
stuff in a 'x-forefront-antispam-report' header you run into this page:

https://technet.microsoft.com/en-us/library/dn205071(v=exchg.150).aspx

Note the paragraph:

 After accessing the message header information, search for
 X-Forefront-Antispam-Report and then look for these fields. Other fields in
 this header are used exclusively by the Microsoft anti-spam team for diagnostic
 purposes.

IE, we're not tellin..

Having been in the same situation as the OP (Done the full Monty monkey dance, 
MX, DKIM, SPF, abuse@, etc) the only thing that I can say is it's all VouDoo.




Re: Bank fraud phish

2017-10-24 Thread David B Funk

On Tue, 24 Oct 2017, Rupert Gallagher wrote:


Easy one. The Message-ID is not well formed / RFC compliant. We reject such 
junk upfront. 

Sent from ProtonMail Mobile


On Tue, Oct 24, 2017 at 8:32 PM, Alex  wrote:
  Hi all, I'm wondering if someone has some ideas to handle bank fraud 
phishing emails, and in particular this one:
  https://pastebin.com/wxFtKK16 It doesn't hit bayes99 because we haven't 
seen one before, and txrep subtracts points.
  It also doesn't hit any blacklists. Ideas for blocking these, and more 
general advice for blocking banking fraud/phish
  attacks would be appreciated.


I'm sorry, what RFC does that message-id fail to comply with?
It's of the form :

 "Message-ID: "

Looks darned correct to me.
It's a bit on the long side but I've seen worse and is still not too long.

The fact that there's folded-whitespace in there is totally permissible as long 
as done correctly, which it looks like it is.
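
For anyone who wants to test this mechanically, here is a small Python sketch: unfold the header value the way RFC 5322 describes (remove each line break that is followed by a space or tab), then apply a deliberately loose shape check. The regular expression is a simplification of the RFC grammar, and the sample values in the usage lines are hypothetical, not taken from the message under discussion.

```python
import re


def unfold(value: str) -> str:
    """RFC 5322 unfolding: drop each CRLF (or bare LF) followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)


# Loose approximation of msg-id: "<" id-left "@" id-right ">"
MSG_ID = re.compile(r"^\s*<[^<>@\s]+@[^<>@\s]+>\s*$")


def looks_like_msg_id(value: str) -> bool:
    """True when the unfolded value has the <left@right> shape."""
    return bool(MSG_ID.match(unfold(value)))


folded_ok = looks_like_msg_id("\r\n <abc.123@mail.example.com>")
missing_at = looks_like_msg_id("<no-at-sign>")
```

A value folded right after the colon passes once unfolded, while a value without an "@" does not.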





Re: Bank fraud phish

2017-10-24 Thread David B Funk

On Tue, 24 Oct 2017, Pedro David Marco wrote:


Out of curiosity...

"account is deactivated due to inactive,"  

is this correct in english? shouldn't it be "inactivity"?


It isn't good English, but I've seen worse from official notices.

Now the fact that it claims to be a US financial company being served from a 
South African website with a cPanel SSL certificate which has a ONE MONTH life 
span is darned fishy.



Re: Your header "To: undisclosed-recipients:;" is RFC 822 compliant

2017-10-27 Thread David B Funk

On Fri, 27 Oct 2017, A. Schulze wrote:




Am 27.10.2017 um 07:15 schrieb @lbutlr:

RFC 822 is obsolete, replaced by RFC 2822.

... which is obsoleted by RFC 5322 and updated some other RFCs
see https://tools.ietf.org/html/rfc5322


And it still explicitly says that construct is legal:
rfc5322:3.4

   ...   This is done by giving a display name for the group,
   followed by a colon, followed by a comma-separated list of any number
   of mailboxes (including zero and one), and ending with a semicolon.
   Because the list of mailboxes can be empty, using the group construct
   is also a simple way to communicate to recipients that the message
   was sent to one or more named sets of recipients, without actually
   providing the individual mailbox address for any of those recipients.

Anybody can block mail for any reason they want ("my server, my rules"). But if 
they claim to do so with RFC justification for this case, then they're playing 
in the realm of "Alternative Facts"
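
As a quick sanity check, Python's standard email parser (with the modern `policy.default`) accepts the empty group construct and reports one named group with no mailboxes. This only shows how one parser reads it; it is an illustration, not a proof of RFC compliance.

```python
from email import message_from_string, policy

raw = "To: undisclosed-recipients:;\nSubject: test\n\nbody\n"
msg = message_from_string(raw, policy=policy.default)

to = msg["To"]
# Each group carries a display name and a (possibly empty) tuple of addresses.
group_names = [g.display_name for g in to.groups]
mailboxes = list(to.addresses)
```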




Re: Problem scanning mails with Spam Assassin on Postfix

2010-08-27 Thread David B Funk
On Fri, 27 Aug 2010, Cimoni Enwis Ogwujiakwu wrote:

> Hello Micheal,
> But I am the ISP here. I provide internet access for subscribers and I have  
> redirected their smtp port 25 traffic through the smtp server, but the 
> response sent earlier when I want to connect as a test subscriber. which 
> forum can assist?
>  
> Thanks
>
> Cimoni Enwis Ogwujiakwu

If you are the ISP then you should have control over the networking. This
sounds like a problem with your networking or a problem with what ever
mechanism you are using to do that "I have redirected their smtp port"

First you should search out what ever support group there is for the
mechanism that you are using to do that smtp port redirection.
That is not a standard smtp server component and would need to be set up
carefully; it sounds like it isn't working as expected.




Re: enabling SpamHaus DBL

2010-08-30 Thread David B Funk
On Tue, 31 Aug 2010, Mark Martinec wrote:

> Lawrence,
>
> > This is a dedicated server in a facility in the US. The server is
> > configured to use the resolvers 4.2.2.1 and 4.2.2.2
> >
> > I wouldn't dream of relying on Google for anything :)
>
> Like I said, your resolver is tricking you. Either by its
> own fault, or SpamHaus is intentionally not providing useful
> results to your DNS resolver:
>
> good (my own resolver):
> $ host -t a midpage.ru.dbl.spamhaus.org.
> midpage.ru.dbl.spamhaus.org has address 127.0.1.2
>
[snip..]
> bad:
> $ host -t a midpage.ru.dbl.spamhaus.org. 4.2.2.2
> Using domain server:
> Name: 4.2.2.2
> Address: 4.2.2.2#53
>
> bad:
> $ host -t a midpage.ru.dbl.spamhaus.org. 8.8.8.8
> Using domain server:
> Name: 8.8.8.8
> Address: 8.8.8.8#53
>
>
> There is no good reason to use ISP's or some public DNS resolver
> for anything but the smallest home network. Just install 'unbound',
> or 'bind' in resolving-only mode.
>
>   Mark

Mark is right.
Spamhaus has a policy of blocking any DNS server which makes "too many"
queries/day against their publicly available DNSBL lists. If you run a
"busy" mail system they want you to buy a data feed.
See: http://www.spamhaus.org/organization/dnsblusage.html

So by using some public/ISP's DNS server, your queries are getting
aggregated with everybody else using that DNS server and probably going
over the Spamhaus limit.

Run your own DNS server/resolver pointing directly to the spamhaus lists
and you won't have that problem. If they still block you then it will be
only your own use and you know that you'll have to spring for the paid
service.

BTW, even if you're below the Spamhaus 100k messages/day limit you can
still exceed the queries/day limit. SA makes multiple queries/message
and when combined with potential MTA queries can result in overload.
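
The arithmetic is simple enough to sketch in Python. The per-message query counts below are made-up numbers for illustration only; measure your own system before drawing conclusions.

```python
def dns_queries_per_day(messages_per_day, sa_queries_per_message,
                        mta_queries_per_message=0):
    """Total DNSBL queries generated per day by SA plus the MTA."""
    return messages_per_day * (sa_queries_per_message + mta_queries_per_message)


# Hypothetical site: 60,000 messages/day, 4 SA lookups and 1 MTA lookup each.
total = dns_queries_per_day(60_000, 4, 1)
```

In this made-up case a site well under 100k messages/day still generates 300,000 queries/day.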



Re: user_prefs questions/problem

2010-09-20 Thread David B Funk
On Mon, 20 Sep 2010, Chuck Campbell wrote:

> > enabled). Is SA integrated in your mail system in a way that it "knows"
>
> Not sure where to enable this.  Will dig more in the docs.
>
> > the user name of the recipient? (some integration methods do not make that
> > info available to SA so the per-user prefs don't work).
> > Have you checked to make sure that your user_prefs are available/readable
> > to the SA daemon?
>
> How do I test this?

Assuming you're running spamd with standard logging enabled, look at the
spamd logs. You should see the username associated in each log entry. EG:

Sep 20 17:39:24 server33 spamd[20757]: spamd: connection from 
s-l104.engr.uiowa.edu [128.255.17.210] at port 36478
Sep 20 17:39:24 server33 spamd[20757]: spamd: checking message 
<20100920163923.oyqyrtl...@mx1.whitebeek.com> for astockda:115
Sep 20 17:39:25 server33 spamd[20757]: spamd: identified spam (29.1/6.0) for 
astockda:115 in 1.2 seconds, 19513 bytes.
Sep 20 17:39:25 server33 spamd[20757]: spamd: result: Y 29 - 
BAYES_99,COMBINED_FROM,FS_DEGREE,FVGT_m_MULTI_ODD2,HTML_90_100,HTML_MESSAGE,HTML_TAG_BALANCE_BODY,HTML_TINY_FONT,L_CLAMAV,MY_CLAMAV,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,SPF_PASS,T__BOTNET_NOTRUST,T__MY_CLAMAV,URIBL_BLACK,URIBL_JP_SURBL,URIBL_OB_SURBL,URIBL_WS_SURBL
scantime=1.2,size=19513,user=astockda,uid=115,required_score=6.0,rhost=s-l104.engr.uiowa.edu,raddr=128.255.17.210,rport=36478,mid=<20100920163923.oyqyrtl...@mx1.whitebeek.com>,bayes=1,autolearn=spam

That "for astockda" and "user=astockda" part is the username that spamd
received from the milter that I use to connect sendmail to spamd
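
If you'd rather check a pile of log lines than eyeball them, a short Python sketch like this pulls the key=value fields out of the tail of a "result:" entry. The function name is hypothetical and the regular expression assumes the comma-separated layout shown in the sample above.

```python
import re


def result_fields(line: str) -> dict:
    """Collect the key=value pairs from a spamd 'result:' log line."""
    return dict(re.findall(r"(\w+)=([^,\s]+)", line))


sample = ("scantime=1.2,size=19513,user=astockda,uid=115,required_score=6.0,"
          "rport=36478,bayes=1,autolearn=spam")
fields = result_fields(sample)
```

A `user` value that is always the same daemon account (or missing) suggests the glue is not passing the recipient's username through.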

To check user_prefs readability, do this:
1) on the machine running spamd (or what ever SA mechanism) login (or su)
   to the user in question.
2) create or obtain a test mail message, store in a text file.
3) run it thru spamassassin in debug mode:
 % spamassassin -D < test-message.txt > /tmp/test.out 2>&1

Then grep for 'user' in the output file:
 % grep user /tmp/test.out
[21751] dbg: config: using "/home/bill/.spamassassin" for user state dir
[21751] dbg: config: using "/home/bill/.spamassassin/user_prefs" for user prefs 
file
[21751] dbg: config: read file /home/bill/.spamassassin/user_prefs
[21751] dbg: Botnet: adding (\b|\d)user(\b|\d) to botnet_clientwords

Note the line that says "read file /home/bill/.spamassassin/user_prefs"
that file should exist and be readable by your spamd process AND be the
file that you've put the user config stuff in.
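
The "exists and is readable" part can also be checked with a few lines of Python, run as the same user that spamd runs as (otherwise the answer means little). The function name is hypothetical; the path layout is the one shown in the debug output above.

```python
import os


def user_prefs_status(home: str):
    """Return (path, status) for <home>/.spamassassin/user_prefs."""
    path = os.path.join(home, ".spamassassin", "user_prefs")
    if not os.path.isfile(path):
        return path, "missing"
    # os.access tests against the real uid/gid of the calling process.
    if not os.access(path, os.R_OK):
        return path, "unreadable"
    return path, "readable"
```

For example, `user_prefs_status("/home/bill")` reports on the file named in the debug lines above.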



Re: Whitelist questions

2010-10-05 Thread David B Funk
On Tue, 5 Oct 2010, Joseph Brennan wrote:

>
> --On Tuesday, October 5, 2010 10:40 -0400 Alex 
> wrote:
>
> > I have an email that I'm trying to whitelist using whitelist_from_rcvd
> > and it's not working as I expect. I've created an entry:
> >
[snip..]
>
> Notice also that the rule checks the header From:, not the envelope,
> and they could be different.

When did that change?
Quoting from the docs for Mail::SpamAssassin::Conf it used to be:

   The headers checked for whitelist addresses are as follows: if "Resent-From" 
is set, use that; otherwise check all
   addresses taken from the following set of headers:

  Envelope-Sender
  Resent-Sender
  X-Envelope-From
  From

   In addition, the "envelope sender" data, taken from the SMTP envelope data 
where this is available, is looked up.

I distinctly remember setting up def_whitelist_from_rcvd entries to work
with envelope from addresses for "listwashing".
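
Read literally, the quoted documentation describes a lookup order that can be sketched in Python like this. It is an illustration of the documented behaviour, not SpamAssassin's actual code, and the function name is made up; it also leaves out the separate SMTP envelope lookup mentioned in the last quoted sentence.

```python
from email import message_from_string
from email.utils import getaddresses

FALLBACK_HEADERS = ("Envelope-Sender", "Resent-Sender", "X-Envelope-From", "From")


def whitelist_candidates(msg):
    """Addresses the whitelist check would consider, per the quoted docs."""
    resent = msg.get_all("Resent-From")
    if resent:
        # Resent-From, when present, is used on its own.
        return [addr.lower() for _, addr in getaddresses(resent) if addr]
    values = []
    for name in FALLBACK_HEADERS:
        values.extend(msg.get_all(name, []))
    return [addr.lower() for _, addr in getaddresses(values) if addr]


msg = message_from_string(
    "X-Envelope-From: <bounce@list.example>\n"
    "From: Alice <alice@example.com>\n\nhi\n"
)
candidates = whitelist_candidates(msg)
```

Under that reading both the envelope-style header and the From: address are candidates, which matches what I remember relying on.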




Re: Question about Max msg size

2010-10-06 Thread David B Funk
On Wed, 6 Oct 2010, durwood wrote:

> I too am starting to see quite a bit of spam that's *just* over the 500k
> threshold due to ~4K-sized image attached to the spam. It almost makes me
> wonder if they are doing this just to get it over the standard SpamAssassin
> threshold.
>
> It seems like the size limit should be applied to the searchable parts of
> the email, not any attached images.

If you have 3rd party image extensions installed (EG imageinfo, FuzzyOCR)
then the image content -is- searchable and a valid part of spam filtering.

Given the processing demands of those modules you care very much about
the size of images attached to messages. With modern 10Megapixel cameras
it's not unusual to find 2~4Mbyte pictures attached to messages.



Re: Question about a spam assassin rule

2010-11-19 Thread David B Funk
On Fri, 19 Nov 2010, Daniel McDonald wrote:

> On 11/19/10 2:51 PM, "Bowie Bailey"  wrote:
>
> > rawbody  FR_3TAG_3TAG
> > m'<[abcefghijklmnoqstuvwxz]{3}>'i
> >
> > It looks for an html tag containing exactly three characters followed by
> > a closing tag which also contains exactly three characters.
>
> But no instances of d,p,r or y.  I'm sure that's a really clever trick for
> something, I just don't have a clue as to what it might be

It was an attempt to find obfuscated HTML junk that spammers were
using to break up spammy words such as "male medications"

EG: viagra

The idea was that most all legit 3 character HTML tags such as ''
contained at least one of those letters ([dpry]) in them. So a purported
tag that had none of them was not legit and thus probably bogus spammer
spoor.
With the evolution of HTML (xml, etc) that's no longer a safe
assumption, so that rule probably FPs.
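
A small Python check shows both the intent and the false-positive risk. It uses only the portion of the pattern that survives in the quoted text (the closing-tag half appears to have been lost in the archive), so treat it as an approximation of the rule, not the rule itself; the sample strings are made up.

```python
import re

# Character class from the quoted rule: every letter except d, p, r and y.
THREE_LETTER_TAG = re.compile(r"<[abcefghijklmnoqstuvwxz]{3}>", re.IGNORECASE)


def hits(html: str):
    """Return every three-letter 'tag' made only of letters in the class."""
    return THREE_LETTER_TAG.findall(html)


bogus = hits("vi<xzq>ag<fjk>ra")    # made-up filler tags splitting a word
legit_pre = hits("<pre>code</pre>")  # contains p and r, so no hit
legit_big = hits("<big>text")        # real tag, yet all letters are in the class
```

The filler tags are caught and `<pre>` is skipped as intended, but a real tag like `<big>` also matches, which is the kind of FP described above.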




Re: HELO_DYNAMIC false positives on a UK web host

2010-12-09 Thread David B Funk
On Thu, 9 Dec 2010, Karsten Bräckelmann wrote:

> On Thu, 2010-12-09 at 14:43 -0800, John Hardin wrote:
> > > It appears that a client can easily set up hosting using cPanel or
> > > something without ever setting the rDNS or hostname to anything other
> > > than the numeric default.
> >
> > Is there anything in the headers that indicates cpanel is in use? Perhaps
> > a meta on cpanel
>
> Proof a mail system has been set up and is being maintained by clicking
> through a simple UI system. Strong hint the operator doesn't know much
> about such systems, and likely not about properly securing auth either.
>
> > + dynamic-looking-rDNS would be worth a negative point or two...
>
> Plus proof the operator indeed doesn't know, or doesn't care. You think
> that's worth a negative score?
>

Maybe not a true negative score but null out the HELO_DYNAMIC rules
score penalty. IE if it's running cpanel then strong probability that
it has a static IP address. (what's the point of running a server
with a dynamic address.)

The poor operator may be totally clueless about how his actual IP address
appears on the net.
He's some schmuck who bought a cheap hosting service for his business and
just did the point-and-click monkey dance to get his store on-line.



Re: DNSBL for email addresses?

2010-12-14 Thread David B Funk
On Tue, 14 Dec 2010, Marc Perkel wrote:

> Are there any DNSBLs out there based on email addresses? Since you can't
> use an @ in a DNS lookup - how would you do DNSBL on email addresses? Is
> there a standard?
>

Why do you say "Since you can't use an @ in a DNS lookup"??
Unless you're using obsolete software there's no reason you cannot.

EG:

 % nslookup 'a...@khath.com.phish.icaen.uiowa.edu'
 Server:  dns2.icaen.uiowa.edu
 Address:  128.255.17.20

 Name:a...@khath.com.phish.icaen.uiowa.edu
 Address:  127.0.0.2

 % nslookup 'a...@khath.com.phish.icaen.uiowa.edu'
 Server:  dns2.icaen.uiowa.edu
 Address:  128.255.17.20

 *** dns2.icaen.uiowa.edu can't find a...@khath.com.phish.icaen.uiowa.edu:
 Non-existent host/domain

and that's with bind-9.4, not a particularly new revision.
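
Building the query name is just string concatenation; the only real constraints come from DNS itself (at most 63 octets per label, 253 characters for the whole name). A Python sketch, with a hypothetical function name and a made-up address in the usage line:

```python
def dnsbl_query_name(address: str, zone: str) -> str:
    """Return '<address>.<zone>' after checking DNS length limits."""
    name = f"{address}.{zone}".rstrip(".")
    for label in name.split("."):
        # The local part, '@' and the first domain label share one DNS label.
        if not 0 < len(label) <= 63:
            raise ValueError(f"bad label length: {label!r}")
    if len(name) > 253:
        raise ValueError("name too long for DNS")
    return name


query = dnsbl_query_name("someone@example.com", "phish.icaen.uiowa.edu")
```

The resulting name can be handed to any resolver call; a long local part is the most likely thing to trip the 63-octet label limit.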



Re: Training Bayes on outbound mail

2011-01-28 Thread David B Funk
On Fri, 28 Jan 2011, David F. Skoll wrote:

> On Fri, 28 Jan 2011 18:10:08 +
> Dominic Benson  wrote:
>
> > Recently, in order to balance the ham/spam ratio given to sa-learn, I
> > have started to pass mail submitted by authenticated users to
> > sa-learn --ham.
>
> > I haven't seen any mention of this strategy on-list or on the web, so
> > I'm interested in whether (a) anyone else does this, and (b) is there
> > a good reason not to do it that I haven't thought of?
>
> It's possibly a good idea, but you want to be really careful of one
> thing: Make sure your users are savvy enough not to have their
> accounts phished.  It'll take just one compromised account that blasts
> out a spam run to destroy the usefulness of your Bayes data.

Amen to that. Sad how many supposedly educated people (say engineering
professors ;) fall for phishes and get their accounts powned. 419 spammers
love to target university systems, semi-clueless users and fat pipes.

One other semi-issue with that strategy, half of Bayes is based upon
header contents. Your outgoing messages are not going to have headers that
are representative of incoming messages.



Re: alert: New event: ET EXPLOIT Possible SpamAssassin Milter Plugin Remote Arbitrary Command Injection Attempt

2011-02-10 Thread David B Funk
On Fri, 11 Feb 2011, Jason Haar wrote:

> On 02/11/2011 09:37 AM, Mark Martinec wrote:
> > Yes, the security hole is entirely within the milter,
> > independent of the MTA.
> >
> That exploit is dated Mar 2010? Has this really not been fixed in about
> a year???
>
>

"a year"??, try half-a-decade. I've got a copy of that code from March
2006 and the vulnerability is there. Rather stale project. ;)



