On 23 Dec 2015, at 5:26, sb wrote:
Re: body_checks(5)
> > The input string for body_checks is a single message body line.
> > so "\A" == "^" and "\z" == "$".
> Both /m and /A are compliant with postfix's pcre_table(5). Therefore,
> \A and \z *must not* fall back to ^ and $ when using /Am.
That apparent "logic" is not what it is pretending to be.
The fact that the pcre subject string does not actually contain multiple
lines but rather only contains one line (as documented for body_checks)
has no effect on how Postfix handles the /A and /m modifiers or the
logical semantics of the \A and \z anchors.
Viktor's statement was not about Postfix-specific logic but about the
degeneracy of pcre anchor semantics when operating in multiline mode on
a single-line subject string. See the pcreapi(3) man page for the full
details, but in short: subject anchors and line anchors match precisely
the same way if a subject is exactly one line and isn't marked as being
a partial line.
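
A minimal sketch of that degeneracy, using Python's re module as a stand-in for pcre. This rests on assumptions: Python's anchors behave like pcre's for this purpose, and Python spells the end-of-subject anchor \Z where pcre uses \z. The pattern body "spam" and the helper name anchors_agree are made up for illustration.

```python
import re

def anchors_agree(subject: str) -> bool:
    """Compare line anchors (^, $) with subject anchors (\\A, \\Z) in multiline mode.

    Returns True when both patterns match at exactly the same positions.
    """
    line_hits = [m.span() for m in re.finditer(r"^spam$", subject, re.MULTILINE)]
    subject_hits = [m.span() for m in re.finditer(r"\Aspam\Z", subject, re.MULTILINE)]
    return line_hits == subject_hits

# A single-line subject, as body_checks is documented to supply:
print(anchors_agree("spam"))             # True: the anchors are interchangeable

# A truly multiline subject, which body_checks never supplies:
print(anchors_agree("ham\nspam\neggs"))  # False: only ^spam$ matches the middle line
```

The anchors only diverge once the subject contains an embedded newline.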
> Postfix's adaptation of pcre
This is not an "adaptation"; it is simply "use." The subject string
provided to pcre by body_checks is, AS CLEARLY DOCUMENTED, always
exactly one raw undecoded body line from the message. The Postfix
modifiers are documented to set various option flags when calling pcre,
and they do so.
> leads to false positives,
> as stated in the OP.
There was no false positive. You expected body_checks to function in a
way that its documentation clearly and explicitly states it DOES NOT
function. Instead, body_checks behaved as it is designed and documented
to behave.
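
To make the documented behavior concrete, here is a rough simulation of line-at-a-time matching. It is not Postfix code. The helper name check_body_lines and the sample body are hypothetical, and Python's re module stands in for pcre.

```python
import re

def check_body_lines(body: str, pattern: str) -> list[str]:
    """Return the body lines that match, testing each line as its own subject."""
    regex = re.compile(pattern, re.MULTILINE)
    # Each line is matched in isolation, so the regex never sees a newline.
    return [line for line in body.split("\n") if regex.search(line)]

body = "first line\nfoo\nbar\nlast line"

# A pattern that spans two lines cannot match any single line:
print(check_body_lines(body, r"foo\nbar"))      # []

# The same pattern does match if the whole body is one subject string:
print(re.search(r"foo\nbar", body) is not None)  # True

# A pattern confined to one line works as expected:
print(check_body_lines(body, r"^bar$"))         # ['bar']
```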
> > Postfix does not read the entire message into memory.
> > That does not scale well.
> Yes, this confirms the OP.
> I suggest to either allow postfix to read the entire message body,
> except the attachments,
I hope Dr. Venema does not follow that suggestion. I expect he will not,
because the lightweight limited functionality of the header and body
checks internal to Postfix is sufficient for some very simple content
inspection but is limited in scope to keep Postfix itself from being a
resource hog or a hiding place for subtle vulnerabilities to crafted
email input. Keeping the scope of Postfix's internal content inspection
tightly limited while providing diverse ways to hook into external tools
has enabled a rich environment for more versatile content filtering
tools like Amavis and MIMEDefang to flourish.
Also, it seems unwise to ask an MTA to define what constitutes an "entire
message body, except the attachments" in a world where
multipart/alternative and multipart/related and message/rfc822 are
defined types used in real messages. MUAs don't all see "body" vs.
"attachment" the same way, and we hardly need Yet Another View by an MTA
that is largely invisible.
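
For a sense of how quickly "the body" stops being a single thing, here is a small sketch using Python's standard email package. The message contents, the filename blob.bin, and the helper part_types are invented for illustration. This shows only how one library lays out such a message, not how any particular MUA or MTA interprets it.

```python
from email.message import EmailMessage

def part_types(msg: EmailMessage) -> list[str]:
    """List the content type of every node in the MIME tree, depth first."""
    return [part.get_content_type() for part in msg.walk()]

msg = EmailMessage()
msg.set_content("plain text version")                       # text/plain
msg.add_alternative("<p>HTML version</p>", subtype="html")  # wraps both in multipart/alternative
msg.add_attachment(b"\x00\x01", maintype="application",
                   subtype="octet-stream", filename="blob.bin")

for content_type in part_types(msg):
    print(content_type)
```

Even this trivial message has two candidate "bodies" nested inside two multipart containers, and deciding which one counts is a policy choice.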
Finally: see Postfix's BUILTIN_FILTER_README for why the internal
filtering functions are so lightweight and limited *by design*.
> or deprecate postfix's /Am altogether.
The /A modifier is useful and functional in Postfix pcre maps. I see no
reason to deprecate it or to change its documentation in the pcre_table
man page, as it seems perfectly accurate.
The /m modifier may in fact never be useful in Postfix pcre maps, and
I'm not sure how one would test whether it is actually functional. I
cannot think of any circumstance where Postfix might want to pass pcre a
truly multiline subject string for a map query, but such a case may
exist. If it does not, perhaps the documentation should make clear that
setting PCRE_MULTILINE isn't actually useful for its primary effect and
will only provide whatever subtle semantic side-effects exist in pcre
when that option is set. Since anyone reading the body_checks man page
carefully knows the subject string never has embedded newlines, there's
not really any reason to deprecate /m. It's not dangerous.