On Wed, Oct 07, 2020 at 10:35:39PM +0000, Pau Peris wrote:

> Could you explain to me which would be the benefits of implementing
> such behaviour on a filter or milter instead of doing it on
> header_checks?

As I wrote upthread, and you quoted in your message:

> > RFC5322.From syntax is rather non-trivial, and trying to parse it with
> > regular expressions is not a terribly good idea.  While most addresses
> > are simple, and you might not ever see the exceptions, I do not
> > recommend ad-hoc half-right parsers for the mailbox syntax.

It is non-trivial to craft robust regular expressions for RFC*22 mailbox
syntax, not quite as bad as:

    https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

but naïve attempts are likely to fall short of the full grammar.

It might be simpler to arrange for multi-recipient messages to the
purported author of the message to be dropped, by passing mail
submission from the Web form through an SMTP service that rejects all
multi-recipient mail (and making sure that the envelope is not split
before that happens).

On the other hand, for a web contact form, if you want to only permit a
single localpart@domain format, rather than any of the more general

    phrase <mailbox>
    "quoted-text" <mailbox>
    mailbox (comment)
    ...

variants, then a regular expression becomes somewhat simpler, until you
also need to handle EAI (non-ASCII localpart and/or domain), e.g.

    виктор1spam@духовный.org

The possible forms are then:

    - dot-atom@domain
    - quoted-string@domain

where the two variants are matched, in order, by:

    # PCRE: ASCII dot-atom @ domain
    /^ (?: [^][()<>:;@\\,."\x00-\x20\x7f-\xff]+ \.)*
           [^][()<>:;@\\,."\x00-\x20\x7f-\xff]+
       @ (?: [a-z\d]+ (-+[a-z\d]+)* \.)+
             [a-z\d]+ (-+[a-z\d]+)*
     $/x DUNNO

    # PCRE: quoted-string sans NUL @ domain
    /^ " ( [^\\"\x00]+ | \\[^\x00] )+ "
       @ (?: [a-z\d]+ (-+[a-z\d]+)* \.)+
             [a-z\d]+ (-+[a-z\d]+)*
     $/x DUNNO

    # Not a valid address
    /^/ whatever action is appropriate

You may want to replace /^/ with /^From:\s*/ if this is header checks.

Postfix does not currently support matching Unicode with PCRE, so
validating EAI addresses with pcre_table(5) may not yet be possible.
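
If you want to try the two forms against sample input before loading
them into Postfix, here is a rough Python sketch of the same idea.  It
is an approximation, not a drop-in equivalent of the table: the helper
name is_single_ascii_address is made up for this example, the local
part is taken as one or more dot-separated atoms, and the whole value
must match (fullmatch), so trailing text such as a second address is
not accepted.  Matching is done on UTF-8 bytes to mimic byte-oriented
matching, which is why the EAI example is refused.

```python
import re

# Bytes that may not appear in an atom (same set as in the PCRE table,
# with the brackets escaped for Python's re module).
ATOM = rb'[^\]\[()<>:;@\\,."\x00-\x20\x7f-\xff]+'
# One DNS label: alphanumerics, hyphens only in the interior.
LABEL = rb'[a-z\d]+(?:-+[a-z\d]+)*'
# At least two labels separated by dots.
DOMAIN = rb'(?:' + LABEL + rb'\.)+' + LABEL

# dot-atom@domain
DOT_ATOM = re.compile(rb'(?:' + ATOM + rb'\.)*' + ATOM + rb'@' + DOMAIN,
                      re.IGNORECASE)
# quoted-string@domain, no NUL bytes; the inner repetition is dropped
# (assumption: same language, less backtracking on bad input).
QUOTED = re.compile(rb'"(?:[^\\"\x00]|\\[^\x00])+"@' + DOMAIN,
                    re.IGNORECASE)


def is_single_ascii_address(value: str) -> bool:
    """True if value is exactly one ASCII localpart@domain (hypothetical helper)."""
    data = value.encode("utf-8")
    return bool(DOT_ATOM.fullmatch(data) or QUOTED.fullmatch(data))


if __name__ == "__main__":
    for sample in ("user@example.com",
                   "h@example.com, f@example.net",
                   "Name <user@example.com>"):
        print(sample, "->", is_single_ascii_address(sample))
```

Anything this refuses would fall through to the final catch-all rule
in the table above.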

> Also, do you know in which cases would be useful to allow or make use
> of multiple From addresses? Just in case I'm missing something.
>
> Thanks in advance,
>
> On Tue, Oct 6, 2020 at 10:50 PM Viktor Dukhovni
> <postfix-us...@dukhovni.org> wrote:
> >
> > On Wed, Oct 07, 2020 at 12:27:09AM +0000, Pau Peris wrote:
> >
> > > I'm hosting my dad's webpage which has a contact form (which should be
> > > improved to avoid spam and/or bots) and from time to time someone
> > > types multiple email addresses in the from field of the form so
> > > contact emails with multiple from addresses like "from:
> > > h...@example.com, f...@example.net" are generated. I thought that those
> > > kind of messages should get rejected and thought that maybe there was
> > > a builtin restriction for this use case.
> >
> > Therefore, the right solution would be in a content filter or milter,
> > coupled with a solid email address (list) parsing library.

-- 
    Viktor.
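
To illustrate the parsing-library approach mentioned above, here is a
minimal Python sketch that counts the mailboxes in the From header
using the standard library's email package.  The function names
count_from_mailboxes and has_single_author are made up for this
example, and the hookup to Postfix (content filter or milter) is not
shown; treat it as a starting point under those assumptions rather
than a finished filter.

```python
from email import message_from_string, policy


def count_from_mailboxes(raw_message: str) -> int:
    """Count mailboxes across all From headers (hypothetical helper)."""
    # policy.default turns on the structured header parser, so each
    # From header exposes a tuple of parsed Address objects.
    msg = message_from_string(raw_message, policy=policy.default)
    headers = msg.get_all("From", [])
    return sum(len(h.addresses) for h in headers)


def has_single_author(raw_message: str) -> bool:
    """True when the message names exactly one From mailbox."""
    return count_from_mailboxes(raw_message) == 1


if __name__ == "__main__":
    two = "From: h@example.com, f@example.net\nSubject: hi\n\nbody\n"
    one = 'From: "Doe, Jane" <jane@example.com>\nSubject: hi\n\nbody\n'
    print(count_from_mailboxes(two), count_from_mailboxes(one))
```

Unlike a regular expression, the parser copes with display names that
contain commas, so a quoted "Doe, Jane" still counts as one mailbox.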