On Thu, Jun 21, 2018 at 10:42:58AM +1000, m...@raf.org wrote:
> Ian Zimmerman wrote:
> > My guess is that mutt looks at the locale environment (LANG and LC_*) to
> > set the encoding of the source data, and tries to recode it into one
> > of the encodings in send_charset.
> >
> > If you _know_ your data is iso-8859-1 but your LANG etc. is something
> > else, try changing LANG locally in your driver script/program.
>
> Thanks. I'll try that. Mutt probably detects that it isn't valid utf-8,
> and so doesn't match the system locale, and so can't be converted.
> That would make sense.

Yes. Mutt assumes that its input is correctly encoded according to the
system's configured locale. The part you seem unclear about is the
conversion on sending: Mutt tries to convert the message from your
configured locale's character set, whatever it is, to each of the
character sets in the send_charset variable, in order, until it finds
one for which the conversion does not fail. It sends that converted
version. The default setting therefore should usually guarantee, for
all English-speaking users and many non-English-speaking European
users, that the outgoing e-mail will be encoded in the simplest
encoding possible, given its contents; i.e. it will send US-ASCII if it
can, then iso-8859-1, and UTF-8 only if full Unicode support is
required to represent the data.

Your problem most likely is exactly what you guessed: your locale is
UTF-8, but the data is not valid UTF-8, so all conversions failed. Mutt
then just sends the bytes you fed it, and it appears that in your case
it failed to default to a reasonable character set (rather than
$charset), for whatever reason related to your combination of locale
and Mutt settings.

> I wonder if setting charset to iso-8859-1 would also fix it.

This might work, but it's not the "right" fix... The right fix is to
make sure that the data you're sending actually matches your system's
locale settings.

Note that given the default send_charset settings, it should be
possible for you to actually use UTF-8 locally and have mutt convert
the e-mail to iso-8859-1, if you really want that for some reason,
since it will use the first charset in your send_charset to which it
can successfully convert the data, as I described above. I used to do
this for Korean that I drafted in UTF-8, since at the time a lot of
Koreans still had systems (Win98) that only supported EUC-KR (WinXP had
been out for years, but some people are extremely slow to update their
systems)...

But if you're using Unicode locally, why not just send it as UTF-8 and
be done with it?
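
To make the send_charset behaviour described above concrete, here is a
rough sketch in Python. It is not Mutt's code: pick_send_charset is a
made-up name, Python's codecs stand in for iconv, and the default list
is assumed to be the stock "us-ascii:iso-8859-1:utf-8". It only shows
the "first charset that converts cleanly wins" idea, and why
ISO-8859-1 bytes fed to a UTF-8 locale cannot be converted at all.

```python
from typing import Optional, Sequence, Tuple

# Assumed to mirror the stock send_charset value; adjust to your own setting.
DEFAULT_SEND_CHARSETS = ("us-ascii", "iso-8859-1", "utf-8")


def pick_send_charset(
    raw: bytes,
    local_charset: str,
    send_charsets: Sequence[str] = DEFAULT_SEND_CHARSETS,
) -> Optional[Tuple[str, bytes]]:
    """Hypothetical helper: return (charset, converted bytes) for the first
    entry in send_charsets that can represent the text, or None if no
    conversion succeeds."""
    try:
        # The input is trusted to be in the locale's charset.
        text = raw.decode(local_charset)
    except UnicodeDecodeError:
        # The input does not match the locale, so nothing can be converted.
        return None
    for charset in send_charsets:
        try:
            return charset, text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; try the next
    return None


if __name__ == "__main__":
    samples = [
        "hello".encode("utf-8"),       # plain ASCII text
        "café".encode("utf-8"),        # fits in iso-8859-1
        "한국어".encode("utf-8"),       # needs utf-8
        "café".encode("iso-8859-1"),   # wrong input for a UTF-8 locale
    ]
    for raw in samples:
        print(raw, "->", pick_send_charset(raw, "utf-8"))
```

The last sample is the situation in this thread: the bytes are
ISO-8859-1 but the locale says UTF-8, so the decode step fails before
any entry in the list is even tried.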
These days it should be just about impossible to find people using
e-mail on systems that can't handle Unicode.

And if the only reason is that the data is already in ISO-8859-1 and
you don't know how to convert it, that's easy to fix: just use the
iconv command (iconv is both a C library and a system command). You
can just convert it from iso-8859-1 to UTF-8 (the character set of your
en_AU.UTF-8 locale) once and stop mucking with incompatible character
set settings. See the man page for details, but it's pretty simple...
you just specify the input character set and the output character set
(the -f and -t options).

> That's for the terminal so it's probably not wise to change that.
> Maybe assumed_charset? (maybe that's only for incoming messages).

You should really never set charset explicitly. If your system is
configured properly, there's virtually never any need to do it, as Mutt
will correctly use your system's locale settings, which is the
preferred way to make sure things are set up correctly. The main
exception is if you have a large pile of pre-existing data that's in
some other charset besides the one you use, which you'll use in some
fashion other than typing it in manually, and converting it would be
prohibitively costly.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.