On 2021-11-02 at 19:15:33 UTC-0400 (Tue, 02 Nov 2021 19:15:33 -0400)
Greg Troxel <g...@lexort.com>
is rumored to have said:

I have a system with postfix and spamassassin 3.4.6 via spamd.  It's
been generally running well.  I noticed mail from one of my other
systems timing out and 471, and that caused me to look at the logs.  I
have KAM rules, some RBL adjustments, a bunch of local rules for my
spam, but really nothing I consider unusual.
[...]
and thus I have two problems:

need to have postfix delay be more than spamassassin delay plus rounding

It would generally be a bad idea to increase the Postfix timeout: that just passes the problem back upstream, because senders will generally time out at 300s as well.

So, add '--timeout-child=295' to your spamd arguments if you want spamd to reliably time out before Postfix does.
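To make the arithmetic concrete, here is a minimal Python sketch that checks whether a given spamd argument string leaves spamd giving up before its caller does. The function names, the 5-second margin, and the 300s caller timeout are illustrative assumptions, not part of SpamAssassin or Postfix; only the '--timeout-child' option and the 300s default come from the discussion above.

```python
import shlex

# Default spamd child timeout in seconds, as described in the message above.
SPAMD_DEFAULT_CHILD_TIMEOUT = 300


def child_timeout(spamd_args: str) -> int:
    """Return the --timeout-child value found in a spamd argument string.

    Hypothetical helper: it only understands the long option, written
    either as --timeout-child=N or as --timeout-child N.
    """
    tokens = shlex.split(spamd_args)
    for i, tok in enumerate(tokens):
        if tok.startswith("--timeout-child="):
            return int(tok.split("=", 1)[1])
        if tok == "--timeout-child" and i + 1 < len(tokens):
            return int(tokens[i + 1])
    return SPAMD_DEFAULT_CHILD_TIMEOUT


def spamd_gives_up_first(spamd_args: str,
                         caller_timeout: int = 300,
                         margin: int = 5) -> bool:
    """True if spamd times out at least `margin` seconds before the caller.

    caller_timeout and margin are assumed values chosen for illustration.
    """
    return child_timeout(spamd_args) + margin <= caller_timeout


if __name__ == "__main__":
    for args in ("--max-children=5", "--max-children=5 --timeout-child=295"):
        print(args, "->", spamd_gives_up_first(args))
```

With no '--timeout-child' argument the check fails, because both sides give up at 300s and either one may lose the race; with 295 it passes.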

  need to figure out why there is a timeout

That's the important part.

The first is surely manual reading, but I wonder why it isn't default.

We don't try very hard to guess what users will want in the integration details between SA and the tools like MTAs that use it. 300s is the SMTP default timeout at end-of-data, which presumably is why it is spamd's default. I think it makes sense to reduce that for most circumstances, but I'm a bit hesitant to do so in the distribution because there could be people relying on the specific idiosyncratic behavior of spamd timing out after its caller has given up rather than before.

On the second, I wonder if anyone else is seeing this, and clues appreciated.

I have no recent SA timeouts logged on any of the systems I manage.

The most common reason for SA to hit its internal timeout is the combination of a rule whose pattern can generate a large number of backtracks while scanning (exponential or factorial order) and a message which triggers that backtracking. Typically that's caused by a '*' or '+' in a pattern where a fixed range for the number of repeats should be used instead. A few years ago we tried to fix all cases of dangerous rules in the default ruleset, and I think we succeeded. I believe the KAM rules have also been audited for likely problems. If you have any unbounded wildcards in your local rules, tightening those rules up should be your first step.

If you can't find and fix the problematic rule by eye, you can get clues about it by scanning a problematic message with the "-D all" option, which gives a detailed rundown of what SA does in scanning a message. That will show you which rules are checked successfully. You can find a problematic rule by comparing the debug output from a bad message to that of a message which doesn't hang SA.
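As a rough illustration of the backtracking problem, the Python sketch below times two toy patterns against a string that almost matches. Neither pattern is a real SpamAssassin rule, and Python's regex engine is only a stand-in for Perl's, so the exact behavior of a given rule in SA may differ; the point is only the shape of the difference between nested unbounded repeats and a single repeat with a fixed range.

```python
import re
import time

# Toy patterns for illustration only, not taken from any SA ruleset.
UNBOUNDED = re.compile(r"^(a+)+$")   # nested unbounded repeats
BOUNDED = re.compile(r"^a{1,64}$")   # one repeat with a fixed range


def time_match(pattern, text):
    """Return (matched, elapsed_seconds) for one search of text."""
    start = time.perf_counter()
    matched = pattern.search(text) is not None
    return matched, time.perf_counter() - start


def compare(n=18):
    """Time both patterns on n 'a' characters followed by a '!'.

    The trailing '!' makes the overall match fail, so a backtracking
    engine tries every way of splitting the run of 'a's before giving up.
    """
    text = "a" * n + "!"
    _, slow = time_match(UNBOUNDED, text)
    _, fast = time_match(BOUNDED, text)
    return slow, fast


if __name__ == "__main__":
    for n in (14, 16, 18, 20):
        slow, fast = compare(n)
        print(f"n={n}: unbounded {slow:.4f}s, bounded {fast:.6f}s")
```

On a backtracking engine the time for the unbounded pattern should roughly double with each extra character, while the bounded pattern stays effectively constant. A message only a few dozen characters longer is enough to push the first form past any reasonable timeout.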


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
