On 8.11.2012, at 4.57, Christoph Anton Mitterer wrote:

> On Wed, 2012-11-07 at 17:33 +0200, Timo Sirainen wrote:
>> Dovecot automatically adds CRs where necessary. Even within the same file
>> there can be mixed LF/CRLF lines.
> Can you detail this a bit, or point me to the specific code areas?
>
> 1) Is only CR added? Or also LF?

If a CR is alone, it's not treated as a newline. So the only change is that a CR may be added before an LF.

> 2) What happens e.g. when LFCR is found? Is that then "doubled" to
> CRLFCR or even CRLFCRLF?

CRLFCR

> 3) When does it "add" these chars? Only when using dovecot-lda? Or also
> when some other MDA places files into e.g. a maildir?

When saving a mail, CRs are either added or removed when writing the mail to disk, depending on the mail_save_crlf setting. When reading a mail and sending it to an IMAP/POP3 client, the CRs are always added. (doveadm fetch text doesn't add/remove CRs, I think.)

> I did some reading on the RFC 5322 which says:
>
> - new mails must not have single CR or LF, both may only occur as CRLF
>
> - but from the previous RFCs, it allows existing messages to have CR and
> LF alone, in which case they are not newlines as CRLF, but rather the CR
> and LF characters in their meaning as control characters.
>
>
> 4) So from that point of view... automatic conversion may actually
> "corrupt" things in a strict sense.
> (One should hope of course, that only few people use(d) CR or LF alone
> to get their control character meaning... but rather that these are just
> cases of accidents.)

The SMTP and IMAP protocols are the only normal ways to get messages into a system. Both of them require CRLF newlines. So there's really no way for Dovecot to ever see valid LF-only newlines. One exception is Content-Transfer-Encoding: binary, but that's not really supported by Dovecot (or by any commonly used SMTP servers either, I think).

> 5) I agree with you that mails should be stored with CRLF, as this is
> their native format.... and I found nothing on the maildir[++] standards
> that would forbid that (neither that would encourage it).
> But for mbox there are "definitions" that _always_ LF is used (AFAIU,
> even on non-UNIX platforms).

mbox isn't really standardized. Anyway, storing mails with CRLF allows some optimizations, but if the mails aren't stored compressed it wastes a bit of disk space.
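To make the CR handling described above concrete, here is a minimal Python sketch. It is not Dovecot's code: the function names add_crs and strip_crs are made up for illustration, and it assumes that "removing" CRs only drops a CR that sits directly before an LF.

```python
import re

# Hypothetical helpers for illustration only; not taken from Dovecot.
# An LF that is not already preceded by a CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def add_crs(data: bytes) -> bytes:
    """Insert a CR before every LF that lacks one.

    A lone CR is not treated as a newline and is left untouched,
    so LFCR becomes CRLFCR rather than CRLFCRLF.
    """
    return _BARE_LF.sub(b"\r\n", data)


def strip_crs(data: bytes) -> bytes:
    """Drop the CR from every CRLF pair (assumption: lone CRs are kept)."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    print(add_crs(b"one\ntwo\r\nthree\n\rfour\rfive"))
    print(strip_crs(b"one\r\ntwo\rthree"))
```

Mixed input is handled per line: lines that already end in CRLF pass through add_crs unchanged, and only bare LFs gain a CR.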

> 6) I went through my mails and basically I found everything:
> CR, LF, CRLF and even LFCR.
> Now I have no real idea how to deal with that?
> Keep all as is? Make all LFs CRLFs and/or all CRs to CRLFs? What about
> the LFCRs? Handle them as group and perhaps swap them to CRLF. Or doing
> the same as with single LFs and CRs.

Why do you need to do something about them? Dovecot should handle all of them fine.
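If the goal is only to see which of these sequences a given mail file contains, without changing anything, a rough sketch along the following lines could do it. The function name count_line_endings is hypothetical, and the rule that CRLF is preferred over LFCR when the two overlap (as in LF CR LF) is an assumption made for this example.

```python
import re
from collections import Counter

# Hypothetical survey helper; it only counts, it never rewrites the data.
# CRLF wins over LFCR when they overlap, so LF CR LF counts as LF + CRLF.
_ENDING = re.compile(rb"\r\n|\n\r(?!\n)|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\n\r": "LFCR", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, LFCR, lone CR and lone LF sequences in raw mail bytes."""
    return Counter(_NAMES[m.group()] for m in _ENDING.finditer(data))


if __name__ == "__main__":
    sample = b"Subject: test\r\nline two\nline three\rline four\n\rend"
    print(count_line_endings(sample))
```

To run it on a stored message, read the file in binary mode (for example with `open(path, "rb").read()`) so that Python does not translate the newlines first.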