On 06/06/2024 03:44, Viktor Dukhovni via mailop wrote:
> On Wed, Jun 05, 2024 at 05:29:16PM +0100, Vsevolod Stakhov via mailop wrote:
>> In fact, the original distinction between structured and unstructured
>> headers defined in RFC2047 just makes parsing extremely complicated, and
>> I personally consider it an example of a standard being accepted with a
>> clear violation of the KISS principle for no good reason.
>
> The distinction is essential, because it would be a terrible mistake to,
> for example, RFC2047-encode the "mailbox" construct in "From", "To", ...
> headers. An RFC2047-ignorant MUA or MTA can still correctly decode the
> addresses in those headers without caring about the display name
> encoding.
>
>> Unfortunately, SMTPUTF8 makes it even worse: instead of following
>> something that works (e.g. punycode), it creates a completely different
>> state machine for parsing messages that are otherwise indistinguishable
>> from generic ASCII-compatible emails.
>
> It seems to me that you may not have thought through the issues deeply
> enough.
>
>> As Rspamd author, I will not change the existing logic, as it treats
>> headers as black boxes and processes them in the following steps:
>> unfold -> rfc2047 decode -> process specific data.
>
> This, IMNSHO, is not a reasonable stance to take...
>
> Such willful disregard of essential interoperability requirements in
> "rspamd" means I will not use it unless you back off from your current
> position, and will strongly discourage others (e.g. postfix-users list
> readers) from using it. I've heard "rspamd" otherwise has some
> appealing features, but this is a show-stopper. :-(

I'm sorry, but I do not accept the interoperability argument in this
context. Rspamd is not an MTA; it is a spam filtering system. Hence, it
has to work *somehow* with broken and non-conformant emails. I've seen
so many messages with bad MIME structure, bad header encoding, broken
Received traces, etc. In all those cases, MUAs were somehow able to show
what was clearly spam/phishing to end users. For example, one spam
campaign used the '\0' character in messages, because this character is
silently removed and ignored by Microsoft Outlook.
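
To illustrate the '\0' trick, here is a minimal Python sketch. It is not
Rspamd code; the function name and the sample subject are hypothetical,
and it only assumes a client that drops NUL bytes before display, as
described above. A naive byte match misses the keyword until the filter
applies the same normalization the client does.

```python
def normalize_for_matching(raw: bytes) -> bytes:
    """Drop NUL bytes so the filter sees what a NUL-ignoring client would show.

    Hypothetical helper for illustration only.
    """
    return raw.replace(b"\x00", b"")


# Hypothetical spam subject with NUL bytes splitting the keyword.
subject = b"Subject: Cheap V\x00i\x00agra"
keyword = b"Viagra"

# The raw bytes do not contain the keyword...
missed = keyword in subject
# ...but the text a NUL-stripping client would display does.
caught = keyword in normalize_for_matching(subject)

print(missed, caught)
```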
What I wanted to say with this message is that complications in the
standards lead to ambiguity in the implementations, which, in
turn, gives spammers opportunities to exploit those
implementation issues. I'm not even talking about standards that are
ambiguous by design, e.g. DKIM simple canonicalization.
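
As a rough illustration of how two implementations can disagree on the
same header, here is a Python sketch. The function names, the regular
expression and the sample header are all hypothetical. Neither function
is Rspamd's or any MUA's actual logic, and the address extraction is
deliberately simplistic (first `<...>` only). It compares extracting the
address before RFC 2047 decoding with decoding the whole header value
first.

```python
import re
from email.header import decode_header
from typing import Optional, Tuple

# Deliberately simplistic: the first <local@domain> found in the value.
ANGLE_ADDR = re.compile(r"<([^<>\s]+@[^<>\s]+)>")


def decode_words(value: str) -> str:
    """Decode RFC 2047 encoded-words in a header value into one string."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


def first_address(value: str) -> Optional[str]:
    match = ANGLE_ADDR.search(value)
    return match.group(1) if match else None


def parse_then_decode(raw: str) -> Tuple[str, Optional[str]]:
    """Structure-aware order: find the address first, then decode the rest."""
    addr = first_address(raw)
    display = decode_words(ANGLE_ADDR.sub("", raw)).strip()
    return display, addr


def decode_then_parse(raw: str) -> Optional[str]:
    """Black-box order: decode the whole value, then look for an address."""
    return first_address(decode_words(raw))


# Hypothetical From value: the encoded display name hides a second "address".
RAW = "=?utf-8?q?Bank_Support_=3Csupport=40bank.example=3E?= <bulk@sender.example>"

print(parse_then_decode(RAW))  # address: bulk@sender.example
print(decode_then_parse(RAW))  # address: support@bank.example
```

Whichever order a given filter or client uses, the two orders yield
different addresses for the same bytes.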