On Sat, May 21, 2022 at 08:37:06AM -0400, Viktor Dukhovni wrote:
> On Fri, May 20, 2022 at 03:54:36PM -0500, Bryan K. Walton wrote:
> 
> The "Return-Path" header is added during final message *delivery*, after
> the message enters the queue, and is almost universally absent at the
> SMTP stage.  Any header checks on "Return-Path" are pointless.
> 
> Instead, use "check_sender_access", since the content of the
> Return-Path header added during final delivery is the envelope
> sender address.

Thanks for the reply, Viktor.  I apologize for the lateness of mine.
We will start using check_sender_access for checks against the envelope
sender address.

> > ù, ǔ, ɫ, ɇ, etc.
> 
> If you haven't enabled SMTPUTF8 support, the "From" header should not
> have such characters present, they're instead encoded quoted-printable
> or base64 via RFC2047.
> 
> Also regular expressions are a rather poor tool for parsing email
> addresses.  You're turning screws with a hammer.  You should probably
> rethink your goals.

SMTPUTF8 is enabled.

What we are trying to do is create a list of email domains that might
look similar to our own domains and to block them during the SMTP
session, on the assumption that they are being used in a phishing
attempt and are attempting to fool our recipients.  We have used the
dnstwist code to generate a list of these domains.

For example, one of our domain names is courseleaf.com.  We want to
block any mail that has similar domain names in the From header.  An
example might be: coǔrṣeleaf.com

Given that we have SMTPUTF8 enabled, what can we use to block email with
coǔrṣeleaf.com in the From: header?
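
To make the target concrete, here is a minimal Python sketch of the two
forms in which such a domain might appear in a header: the raw UTF-8
octets and the ASCII A-label.  It is only an illustration; wire_forms is
a hypothetical helper, not part of Postfix or dnstwist, and it relies on
Python's built-in "idna" codec (IDNA 2003), which may differ from what a
given sending system produces.

```python
# Sketch only: show the forms a lookalike domain may take on the wire.
# wire_forms is a hypothetical helper name, not a Postfix or dnstwist API.

def wire_forms(domain: str) -> dict:
    """Return the raw UTF-8 octets and the A-label form of a domain."""
    utf8_octets = domain.encode("utf-8")
    # Python's built-in "idna" codec implements IDNA 2003 (ToASCII).
    a_label = domain.encode("idna").decode("ascii")
    return {"utf8_octets": utf8_octets, "a_label": a_label}


# "co\u01d4r\u1e63eleaf.com" is the lookalike domain coǔrṣeleaf.com
forms = wire_forms("co\u01d4r\u1e63eleaf.com")
print(forms["utf8_octets"])  # the octets a raw UTF-8 header would carry
print(forms["a_label"])      # the xn-- form an A-label header would carry
```

Both forms would presumably need to be covered, since either could show
up depending on the sending system.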

> > I've read this page:
> > https://www.postfix.org/SMTPUTF8_README.html and I understand that
> > header checks are not UTF-8 enabled.  My understanding of that page is
> > that I must add *UTF8 to the beginning of the PCRE pattern.  I'm a
> > little unclear about what the pattern would look like.
> 
> No.  The correct interpretation is that expecting valid UTF8 syntax is
> not realistic, and that you'd end up rejecting messages you'd want to
> accept if you did that.  You should therefore NOT add that prefix.

Very good to know.  Thank you for correcting my misunderstanding.

> Are you sure that's actually the domain in the From header?  It could
> well be in A-label form: xn--1105i-yzaa.com
> 
> You could also (without enabling UTF8 RE syntax) check for the
> underlying raw octets of the UTF-8 encoding of "ĕ".  All you
> need to do for that is edit the regexp/pcre table with a UTF-8
> enabled editor, and type a literal "ĕ" into the pattern.
> 
>     $ echo ĕĕ | (LANG=C LC_CTYPE=C LC_ALL=C egrep ĕĕ)
>     ĕĕ

Sorry, I'm not understanding this.  I've tried putting into my header
checks:

coǔrṣeleaf.com
co=C7=94r=E1=B9=A3eleaf.com (quoted printable)
co0x01D4r0x1E63eleaf.com (Unicode code points written out in ASCII)

With each of these patterns, mail with coǔrṣeleaf.com in the From header
still gets through.  What am I missing?
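
For comparison, a byte-level check outside Postfix can be sketched in
Python.  This assumes the header carries the raw UTF-8 octets; the
sample From line is made up, and Python's re module merely stands in for
a Postfix pcre table, so it says nothing definite about Postfix itself.

```python
import re

# Made-up sample header, encoded the way a raw UTF-8 header would arrive.
header = "From: Someone <someone@co\u01d4r\u1e63eleaf.com>".encode("utf-8")

# The three pattern styles tried above, as byte patterns (dot escaped).
patterns = {
    "literal UTF-8": "co\u01d4r\u1e63eleaf\\.com".encode("utf-8"),
    "quoted-printable text": rb"co=C7=94r=E1=B9=A3eleaf\.com",
    "code point text": rb"co0x01D4r0x1E63eleaf\.com",
}

# Match each pattern against the raw octets of the header.
results = {name: re.search(pat, header) is not None
           for name, pat in patterns.items()}

for name, hit in results.items():
    print(f"{name}: {'match' if hit else 'no match'}")
```

In this sketch only the literal UTF-8 pattern matches; the other two are
plain ASCII text that never appears in the header.  If the literal form
does not match in Postfix either, the header is perhaps arriving in
another form, such as the A-label mentioned above.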

> UTF-8 encoded patterns match UTF-8 encoded input.  The only
> reason to use explicit UTF-8 in regular expressions is to use
> fancy Unicode features (character classes) in the pattern.

Very good to know.  Thanks.

Thanks,
Bryan

-- 
Bryan K. Walton                                           319-337-3877 
Linux Systems Administrator                 Leepfrog Technologies, Inc 
