On 2021-11-02 at 19:15:33 UTC-0400 (Tue, 02 Nov 2021 19:15:33 -0400)
Greg Troxel <g...@lexort.com>
is rumored to have said:
I have a system with postfix and spamassassin 3.4.6 via spamd. It's
been generally running well. I noticed mail from one of my other
systems timing out and 471, and that caused me to look at the logs. I
have KAM rules, some RBL adjustments, a bunch of local rules for my
spam, but really nothing I consider unusual.
[...]
and thus I have two problems:
need to have postfix delay be more than spamassassin delay plus
rounding
It would generally be a bad idea to increase the Postfix timeout, as
that passes the problem back upstream as senders will generally time out
at 300s as well.
So, add '--timeout-child=295' to your spamd arguments if you want to
make spamd timeout faster than Postfix reliably.
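As a rough illustration of that ordering (not anything SA or Postfix ships), here is a minimal Python sketch that derives the spamd argument from the caller's limit. The function names and the 5-second margin are my own assumptions; only the '--timeout-child' option and the 300s figure come from the discussion above.

```python
def spamd_timeout(mta_timeout_s=300, margin_s=5):
    """Return a spamd child timeout that expires before the MTA gives up.

    mta_timeout_s: how long the caller (e.g. Postfix) waits, in seconds.
    margin_s: hypothetical safety margin to absorb rounding and overhead.
    """
    if margin_s <= 0 or margin_s >= mta_timeout_s:
        raise ValueError("margin must be positive and smaller than the MTA timeout")
    return mta_timeout_s - margin_s


def spamd_args(mta_timeout_s=300, margin_s=5):
    """Build the spamd argument that enforces the shorter timeout."""
    return [f"--timeout-child={spamd_timeout(mta_timeout_s, margin_s)}"]


print(spamd_args())
```

The point is only that the spamd limit is computed from the caller's limit rather than set independently, so the two cannot drift into the wrong order.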
need to figure out why there is a timeout
That's the important part.
The first is surely manual reading, but I wonder why it isn't default.
We don't try very hard to guess what users will want in the integration
details between SA and the tools like MTAs that use it. 300s is the SMTP
default timeout at end-of-data, which presumably is why it is spamd's
default. I think it makes sense to reduce that for most circumstances,
but I'm a bit hesitant to do so in the distribution because there could
be people relying on the specific idiosyncratic behavior of spamd timing
out after its caller has given up rather than before.
On the second, I wonder if anyone else is seeing this, and clues
appreciated.
I have no SA timeouts logged recently on any of the systems I
manage.
The most common reason for SA to hit its internal timeout is the
combination of a rule with a pattern that can generate a large number of
backtracks while scanning (exponential or factorial order) and a message
which causes such backtracking. Typically that's caused by a '*' or '+'
in a pattern where a fixed range for the number of repeats should be
used instead. A few years ago we tried to fix all cases of dangerous
rules in the default ruleset, and I think we succeeded. I believe the
KAM rules have also been audited for likely problems. If you have any
unbounded wildcards in your local rules, tightening those rules up
should be your first step. If you can't find and fix the problematic
rule by eye, you can get clues about it by scanning a problematic
message with the "-D all" option to get a detailed rundown of what SA
does in scanning a message. That will show you what rules are checked
successfully. You can find a problematic rule by comparing that debug
output from a bad message to that of a message which doesn't hang SA.
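For the "by eye" audit of local rules, a crude helper can narrow the search. The following is a minimal Python sketch, not part of SpamAssassin: it flags rule lines whose pattern contains an unbounded repeat ('*', '+' or '{n,}'). It assumes rules written with '/.../' delimiters on a single line, skips escaped characters and bracketed classes, and knows nothing about which repeats are actually dangerous, so treat its output as a list of candidates to review. The rule names in the sample are made up.

```python
import re

# Rule types that carry a regular expression (a subset, assumed to be
# enough for typical local rules).
RULE_TYPES = ("body", "rawbody", "full", "uri", "header")


def unbounded_quantifiers(pattern):
    """Return the unbounded quantifiers ('*', '+', '{n,}') found in pattern.

    Escaped characters and the contents of [...] classes are skipped.
    This is a heuristic scan, not a real regex parser.
    """
    found = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "*+":
            found.append(ch)
        elif ch == "{":
            m = re.match(r"\{\d+,\}", pattern[i:])
            if m:
                found.append(m.group(0))
        i += 1
    return found


def scan_rules(text):
    """Return (rule_name, quantifiers) pairs for rules with unbounded repeats."""
    results = []
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or parts[0] not in RULE_TYPES:
            continue
        rest = parts[2]
        start, end = rest.find("/"), rest.rfind("/")
        if start == -1 or end <= start:
            continue  # no /.../ pattern on this line (e.g. an eval rule)
        hits = unbounded_quantifiers(rest[start + 1:end])
        if hits:
            results.append((parts[1], hits))
    return results


# Hypothetical local rules, for illustration only.
SAMPLE_RULES = r"""
body     LOCAL_LOOSE   /free.*money.+now/i
body     LOCAL_TIGHT   /free.{0,40}money/i
header   LOCAL_SUBJ    Subject =~ /^re:\s{2,}/i
score    LOCAL_LOOSE   2.0
"""

for name, hits in scan_rules(SAMPLE_RULES):
    print(name, hits)
```

In the sample, LOCAL_TIGHT shows the kind of fix meant above: '.{0,40}' in place of '.*' puts a fixed ceiling on how far the engine can backtrack.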
--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire