(Subject line altered - original was confused)

On Friday 30 September 2011 00:07:08 Michael M Slusarz wrote:
> Quoting Andrew Richards <ar-dovecotl...@acrconsulting.co.uk>:
> > Hi,
> >
> > I've noticed a possible minor issue with long encoded filenames for
> > attachments
> > where these filenames are split across multiple lines. My understanding
> > of character encoding and MIME is not as good as it should be, so I may
> > easily have got this all mixed up, in which case sorry for the noise...
> >
> > Although I understand the preferred method for handling filenames
> > split across multiple lines (because they're too long to fit on one line
> > in the message) is that suggested in RFC2184/2231, so for example,
> > filename*0*=iso-8859-1''accented_characters_here_%EA%CA%E6
> > filename*1=etc%2Epdf
> >
> > I find that some mail clients do this instead,
> > filename="=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?=
> > =?ISO-8859-1?Q?etc=2Epdf?="
> >
> > In Dovecot this results in,
> > 0 fetch 25 body
> > * 25 FETCH (BODY (("text" "plain" ("charset" "ISO-8859-1") NIL NIL "7bit"
> > 239 8)("application" "pdf" ("name"
> > "=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?=
> > =?ISO-8859-1?Q?etc=2Epdf?=") NIL NIL "base64" 219130) "mixed"))
> >
> > esp. note the unwanted space - or in fact the sequence ?= =? between the
> > two sections of the filename. I think a possible tweak for Dovecot would
> > be to combine the filename parts in this situation to remove the ?= =?.

Correcting myself: ...remove the ?= =?ISO-8859-1?Q? (not just ?= =?) to
generate the string in this example,
"=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6etc=2Epdf?="

> > I'm not sure
> > if an IMAP client should know to combine the parts in their current
> > format. FWIW I see that Courier does the same as Dovecot in this
> > situation.
>
> Dovecot's behavior is correct. There's nothing "special" about that
> name parameter - it's not RFC 2231 encoded - so the IMAP server should
> output the exact header text as-is. Those two parts were separated by
> space in the original header - they should be separated by space when
> grabbing the fetch data.

I can accept that Dovecot's behaviour is technically correct, but my point
is this: if I've understood correctly, some large mailers like Gmail act in
a non-RFC2231 manner, so is it worth adapting Dovecot to play nicely with
them?

Possibly I'm conflating 2 separate issues:
- munging together non-RFC2231 attachment filename parts;
- large mailers not using RFC2231 to handle long non-ASCII filenames.

> If the *client* wants to workaround these broken messages, it can do
> whatever munging is wants to translate the contents of the "name"
> parameter. But that should be up to the client. An IMAP server
> should not be making wild assumptions about what the original sender
> wanted to do with the message vs. what it actually sent.
>
> FYI: A workaround is to do something like this when sending a message:
>
> Content-Dispostion: attachment;
> filename="=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?=
> =?ISO-8859-1?Q?etc=2Epdf?=";
> filename*0*=iso-8859-1''accented_characters_here_%EA%CA%E6;
> filename*1=etc%2Epdf

Sure: I accept that that's the preferred way to handle long filenames that
need to be encoded. I'm noting, though, that there are badly-behaved large
mailers that don't do so, so I wonder if it's worth Dovecot mitigating the
effects.

Best regards,

Andrew.
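
P.S. To make the two kinds of munging discussed above concrete, here is a minimal Python sketch of what a client (not the server) might do with the "name" value from the FETCH response. It is only an illustration under stated assumptions: merge_q_words and decode_name are hypothetical helper names of my own, not anything from Dovecot or an IMAP library; the only library call used is decode_header from Python's standard email.header module. The merge step deliberately handles Q-encoded words only, since joining base64 words textually is not safe in general.

```python
import re
from email.header import decode_header

# One Q-encoded word, its closing "?=", whitespace, then the opener of a
# second Q-encoded word in the same charset (backreference \2).
_ADJACENT_Q = re.compile(
    r"(=\?([^?\s]+)\?[Qq]\?[^?\s]*)\?=\s+=\?\2\?[Qq]\?",
    re.IGNORECASE,
)


def merge_q_words(raw: str) -> str:
    """Join adjacent Q-encoded words that share a charset into one word.

    Hypothetical helper: this is the textual merge suggested in the
    message, i.e. dropping the "?= =?ISO-8859-1?Q?" between the parts.
    """
    previous = None
    while previous != raw:  # repeat so runs of three or more words merge too
        previous = raw
        raw = _ADJACENT_Q.sub(r"\1", raw)
    return raw


def decode_name(raw: str) -> str:
    """Decode a value made of RFC 2047 encoded-words into readable text.

    Hypothetical helper: decode_header already discards whitespace that
    sits between two encoded-words, so no manual merging is needed here.
    """
    pieces = []
    for part, charset in decode_header(raw):
        if isinstance(part, bytes):
            # Assumption: the charset is one Python knows; an unknown
            # charset would raise LookupError.
            part = part.decode(charset or "ascii", errors="replace")
        pieces.append(part)
    return "".join(pieces)
```

Given the two-part name from the example, merge_q_words returns the single encoded-word shown in my correction, while decode_name goes one step further and returns the readable filename. Note that a merged word can exceed the 75-character limit RFC 2047 places on an encoded-word, which is one more reason this kind of rewriting may be better left to clients.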