Bill Cole <sausers-20150...@billmail.scconsult.com> writes:

> It would generally be a bad idea to increase the Postfix timeout, as
> that passes the problem back upstream as senders will generally time
> out at 300s as well.
>
> So, add '--timeout-child=295' to your spamd arguments if you want to
> make spamd timeout faster than Postfix reliably.

Thanks; I didn't think of the previous timeout. Before getting your
mail, I did set my postfix milter timeout to 330s, but the actual delay
was ~301s since the spamd timeout worked. That resulted in delivery and
the remote system (also postfix) not giving up. I have since changed to
--timeout-child=290 in spamd and restored postfix to default.

>> need to figure out why there is a timeout
>
> That's the important part.

I am narrowing the circumstances and will follow up when I figure it
out.

>> The first is surely manual reading, but I wonder why it isn't default.
>
> We don't try very hard to guess what users will want in the
> integration details between SA and the tools like MTAs that use
> it. 300s is the SMTP default timeout at end-of-data, which presumably
> is why it is spamd's default. I think it makes sense to reduce that
> for most circumstances, but I'm a bit hesitant to do so in the
> distribution because there could be people relying on the specific
> idiosyncratic behavior of spamd timing out after its caller has given
> up rather than before.

It strikes me that a timeout happening is basically a symptom of a bug,
and each layer should be set up to avoid being non-responsive to the
calling layer. While I see your point about not tuning for what people
might want, it seems that if a system is to meet the 300s SMTP data
timeout, spamd needs to take less than 300s, so going for 290 or 295
seems sensible. I would guess, without any real basis, that far more
people are just sitting on latent trouble than really intend to have a
milter callout give up about a second before spamd does.

> The most common reason for SA to hit its internal timeout is the
> combination of a rule with a pattern that can generate a large number
> of backtracks while scanning (exponential or factorial order) and a
> message which causes such backtracking. Typically that's caused by a
> '*' or '+' in a pattern where a fixed range for the number of repeats
> should be used instead. A few years ago we tried to fix all cases of
> dangerous rules in the default ruleset, and I think we succeeded. I
> believe the KAM rules have also been audited for likely problems. If
> you have any unbounded wildcards in your local rules, tightening those
> rules up should be your first step. If you can't find and fix the
> problematic rule by eye, you can get clues about it by scanning a
> problematic message with the "-D all" option to get a detailed rundown
> of what SA does in scanning a message. That will show you what rules
> are checked successfully. You can find a problematic rule by comparing
> that debug output from a bad message to that of a message which
> doesn't hang SA.

Thanks, that regexp hint is a huge clue to me.
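
For anyone following along, the backtracking point can be seen in a small
sketch. This is only an illustration under stated assumptions: it uses
Python's re module rather than the Perl regexp engine that SA rules run
under, so the absolute timings (and which inputs are slow) will differ, and
both patterns and the test text are made up for the example, not taken from
any real ruleset. It compares an unbounded '.*' pattern with the same
pattern using a fixed repeat range, on text that almost matches many times
but never completes.

```python
import re
import time

# Hypothetical rule bodies, invented for this illustration only.
UNBOUNDED = re.compile(r"foo.*bar.*baz.*qux")
BOUNDED = re.compile(r"foo.{0,20}bar.{0,20}baz.{0,20}qux")

# A line that nearly matches over and over but never contains "qux",
# so the engine has to exhaust every way of placing foo/bar/baz.
TEXT = "foo bar baz " * 80


def timed_search(pattern, text):
    """Run one search and return (match_or_None, elapsed_seconds)."""
    start = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - start


unbounded_match, unbounded_secs = timed_search(UNBOUNDED, TEXT)
bounded_match, bounded_secs = timed_search(BOUNDED, TEXT)

# Neither pattern matches; the difference is how long the failure takes.
print(f"unbounded: match={unbounded_match} time={unbounded_secs:.6f}s")
print(f"bounded:   match={bounded_match} time={bounded_secs:.6f}s")
```

With the unbounded pattern, the work grows roughly with the number of
foo/bar/baz combinations in the line, while the bounded pattern only ever
looks 20 characters ahead at each step. Nested quantifiers such as
'(a+)+' are worse still and can grow exponentially.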