On 19Nov2016 13:13, Kevin J. McCarthy <ke...@8t8.us> wrote:
On #mutt, andrey_utkin_ reported getting a bounce trying to reply to a
linux-kernel mailing list email.  When he replied, vger.kernel.org
bounced it because of raw utf-8 in a header.

He posted a gist at
<https://gist.github.com/andrey-utkin/b204666c34613858a34844283571ce17>

I don't know how long those hang around, but the problem is in the
References header: <201611170549.Q3WTfoMBþngguang...@intel.com>
contains the utf-8 character "þ".

Are any of you familiar with the rules for Message-ID and References
headers?

Somewhat. Reviewing RFC 5322 right now to see how dated my knowledge is ...

 https://tools.ietf.org/rfcmarkup/5322

Should mutt be rfc-2047 encoding/decoding the references
header?

No. RFC2047 encoded-words need to be whitespace delimited from the surrounding text, but no whitespace is permitted inside the "<" and ">" markers which enclose a message-id:

 https://tools.ietf.org/rfcmarkup/5322#section-3.6.4

The whitespace padding requirement is discussed in RFC2047 section 5:

 https://tools.ietf.org/rfcmarkup?doc=2047#section-5

The RFC5322 message-id syntax prevents using RFC2047.

I think the cited message-id is simply illegal and unfixable. Mutt should perhaps support it for stitching threads together, but arguably _not_ release such a thing into the wild in new References: or In-Reply-To: headers.
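
To make the "accept for threading, but don't send" idea concrete, here is a rough Python sketch (not mutt code). It checks each token of a References value against my reading of the RFC 5322 section 3.6.4 msg-id grammar, ignoring the obsolete forms, and keeps only the ids that pass. The names is_strict_msg_id and sendable_references are made up for illustration, and the ids in the usage note are invented examples rather than the one from the gist.

```python
import re

# atext from RFC 5322 section 3.2.3: printable ASCII only, so a raw
# UTF-8 character such as "þ" can never match.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
_DOT_ATOM = rf"{_ATEXT}+(?:\.{_ATEXT}+)*"
# no-fold-literal: "[" *dtext "]", with dtext = %d33-90 / %d94-126
_NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"
_MSG_ID = re.compile(rf"<{_DOT_ATOM}@(?:{_DOT_ATOM}|{_NO_FOLD_LITERAL})>")


def is_strict_msg_id(token):
    """True if token is a non-obsolete RFC 5322 msg-id (hypothetical helper)."""
    return _MSG_ID.fullmatch(token) is not None


def sendable_references(header_value):
    """Keep only the ids from a References value that pass the strict check.

    Hypothetical policy: the rejected ids can still be used locally for
    threading, they are just not copied into outgoing headers.
    """
    return [tok for tok in header_value.split() if is_strict_msg_id(tok)]
```

With this, sendable_references("<a.1@example.com> <bþ@example.com>") keeps only the first id. Whether silently dropping an id is better than refusing to send is a separate policy question.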

What about the domain part - should we be idn encoding that
part if $idn_encode is set?

Perhaps, if required. It looks like RFC3490's encoding is legal dot-atom-text for RFC5322 (based on my reading of the Wikipedia article). The RFC is here:

 https://tools.ietf.org/rfcmarkup?doc=3490

and the article I've consulted has the relevant section here:

 
https://en.wikipedia.org/wiki/Internationalized_domain_name#ToASCII_and_ToUnicode
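
As an illustration of what ToASCII would do to the right-hand side of an id, here is a small Python sketch using the standard library's "idna" codec, which implements RFC 3490. The function name idn_encode_domain and the example id are invented, and this is not a claim about how mutt's $idn_encode is implemented.

```python
def idn_encode_domain(msg_id_body):
    """Apply RFC 3490 ToASCII to the part after the last "@".

    msg_id_body is the text between "<" and ">" (hypothetical helper).
    The left-hand side is returned untouched.
    """
    left, sep, domain = msg_id_body.rpartition("@")
    if not sep:
        # No "@" at all: nothing that looks like a domain to encode.
        return msg_id_body
    return left + "@" + domain.encode("idna").decode("ascii")
```

For example, idn_encode_domain("id123@bücher.example") gives "id123@xn--bcher-kva.example", and an all-ASCII domain comes back unchanged. Note that in the id cited above the "þ" sits to the left of the "@", so encoding the domain alone would not repair that particular header.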

Cheers,
Cameron Simpson <c...@zip.com.au>
