On Wed, Oct 07, 2020 at 10:35:39PM +0000, Pau Peris wrote:

> Could you explain to me which would be the benefits of implementing
> such behaviour on a filter or milter instead of doing it on
> header_checks?

As I wrote upthread, and you quoted in your message:

> > RFC5322.From syntax is rather non-trivial, and trying to parse it with
> > regular expressions is not a terribly good idea.  While most addresses
> > are simple, and you might not ever see the exceptions, I do not
> > recommend ad-hoc half-right parsers for the mailbox syntax.

It is non-trivial to craft robust regular expressions for RFC*22 mailbox
syntax.  The problem is not quite as bad as:

    
    https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

but naïve attempts are likely to fall short of the full grammar.  It
might be simpler to arrange for multi-recipient messages to the
purported author of the message to be dropped, by passing mail
submission from the Web form through an SMTP service that rejects
all multi-recipient mail (and making sure that the envelope is
not split before that happens).
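
To illustrate why a parser beats ad-hoc matching, here is a minimal
Python sketch that counts the mailboxes in a From value using the
standard library's email.utils.getaddresses().  The function names and
sample addresses are hypothetical, and the stdlib parser is lenient, so
treat this as an illustration rather than a validator:

```python
from email.utils import getaddresses


def count_from_mailboxes(from_value: str) -> int:
    """Count the mailboxes a header parser finds in a From value."""
    pairs = getaddresses([from_value])
    # Each pair is (display name, addr-spec); skip entries with no address.
    return sum(1 for _name, addr in pairs if addr)


def naive_count(from_value: str) -> int:
    """Ad-hoc approach: split on commas, ignoring quoting rules."""
    return len(from_value.split(","))


single = '"Doe, Jane" <jane@example.com>'
multiple = "h@example.com, f@example.net"

print(count_from_mailboxes(single), naive_count(single))
print(count_from_mailboxes(multiple), naive_count(multiple))
```

The comma inside the quoted display name is what trips up the naive
split: the parser sees one mailbox where the split sees two.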

On the other hand, for a web contact form, if you want to only permit
a single

    localpart@domain

format, rather than any of the more general

    phrase <mailbox>
    "quoted-text" <mailbox>
    mailbox (comment)
    ...

variants, then a regular expression becomes somewhat simpler, until
you also need to handle EAI (non-ASCII localpart and/or domain), e.g.

    виктор1spam@духовный.org

Leaving EAI aside, the possible forms are then:

    - dot-atom@domain
    - quoted-string@domain

These two forms are matched, respectively, by:

    # PCRE: ASCII dot-atom @ domain (any number of dot-separated atoms)
    /^ (?: [^][()<>:;@\\,."\x00-\x20\x7f-\xff]+ \.)* [^][()<>:;@\\,."\x00-\x20\x7f-\xff]+ @ (?: [a-z\d]+ (-+[a-z\d]+)* \.)+ [a-z\d]+ (-+[a-z\d]+)* $/x  DUNNO

    # PCRE: quoted-string sans NUL @ domain
    /^ " ( [^\\"\x00]+ | \\[^\x00] )+ " @ (?: [a-z\d]+ (-+[a-z\d]+)* \.)+ [a-z\d]+ (-+[a-z\d]+)* $/x  DUNNO

    # Not a valid address
    /^/     whatever action is appropriate

If this is for header_checks(5), replace the leading /^ of each pattern
with /^From:\s*, so that the fallback rule /^/ becomes /^From:\s*/.

Postfix does not currently support matching Unicode with PCRE, so
validating EAI addresses with pcre_table(5) may not yet be possible.
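
For experimenting with candidate addresses outside Postfix, the two
forms can be approximated in Python.  This is only a sketch under some
assumptions: Python's re module is not PCRE, the names below are
hypothetical, the input is matched as UTF-8 bytes to mimic byte-wise
matching, any number of dot-separated atoms is allowed, and fullmatch()
anchors both ends so trailing text (such as a second address) fails:

```python
import re

# Bytes not allowed in an unquoted ASCII atom (brackets escaped for Python).
ATOM = rb'[^\]\[()<>:;@\\,."\x00-\x20\x7f-\xff]+'
# One DNS-style label: alphanumerics with interior hyphens only.
LABEL = rb'[a-z\d]+(?:-+[a-z\d]+)*'
# At least two labels separated by dots.
DOMAIN = rb'(?:' + LABEL + rb'\.)+' + LABEL

DOT_ATOM = re.compile(rb'(?:' + ATOM + rb'\.)*' + ATOM + rb'@' + DOMAIN,
                      re.IGNORECASE)
# Quoted local part: non-NUL bytes, with backslash escaping.
QUOTED = re.compile(rb'"(?:[^\\"\x00]|\\[^\x00])+"@' + DOMAIN,
                    re.IGNORECASE)


def is_single_address(value: str) -> bool:
    """True if value is exactly one ASCII localpart@domain address."""
    data = value.encode("utf-8")
    return bool(DOT_ATOM.fullmatch(data) or QUOTED.fullmatch(data))


for candidate in ("user@example.com",
                  '"john smith"@example.com',
                  "h@example.com, f@example.net",
                  "виктор1spam@духовный.org"):
    print(candidate, is_single_address(candidate))
```

Because the non-ASCII bytes of a UTF-8 encoded EAI address fall in the
excluded \x7f-\xff range, such addresses are rejected by this sketch.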

> Also, do you know in which cases would be useful to allow or make use
> of multiple From addresses? Just in case I'm missing something.
> 
> Thanks in advanced,
> 
> On Tue, Oct 6, 2020 at 10:50 PM Viktor Dukhovni
> <postfix-us...@dukhovni.org> wrote:
> >
> > On Wed, Oct 07, 2020 at 12:27:09AM +0000, Pau Peris wrote:
> >
> > > I'm hosting my dad's webpage which has a contact form (which should be
> > > improved to avoid spam and/or bots) and from time to time someone
> > > types multiple email addresses in the from field of the form so
> > > contact emails with multiple from addresses like "from:
> > > h...@example.com, f...@example.net" are generated. I though that those
> > > kind of messages should get rejected and thought that maybe there was
> > > a builtin restriction for this use case.
> >
> > Therefore, the right solution would be in a content filter or milter,
> > coupled with a solid email address (list) parsing library.

--
    Viktor.
