Re: Charset issue?
On Saturday, May 12, 2007 at 14:37:14 +1200, Roland Hill wrote:

> $ echo $LANG
> en_NZ.UTF-8

OK: For now, keep this one. And don't set $charset in muttrc: It is supposed to take the good value automagically. Type ":set ?charset" directly in Mutt to check it says charset="utf-8". Then take a look at the garbled messages.

> LANGUAGE=en_NZ:en

That is a valid value: A colon-separated list of locales in preferred order for messages. A sort-of super-LC_MESSAGES. However your setting is not very interesting, so I'd tend to advise unsetting it, and wiping it from whatever startup file(s) do set it. But you can do as you like.

>>| $ printf "L1: won\0264t \0250reply\0250\nU8: won\0302\0264t
>> \0302\0250reply\0302\0250\n"
> A picture speaks a thousand words?

Damn! Sorry Roland, my mistake: I managed to give you a syntax that your printf didn't like. Please excuse me, and run this instead:

| $ printf "L1: won\xB4t \xA8reply\xA8\nU8: won\xC2\xB4t \xC2\xA8reply\xC2\xA8\n"

On Saturday, May 12, 2007 at 16:44:17 +1200, Roland Hill wrote:

> Mostly Eterm when at home and KDE's konsole (not gnome). This has
> settings of: $TERM = linux

Setting TERM=linux for anything other than the Linux console is...

> I also use Putty on MS boxes. Not at one now but I have the $TERM set
> as 'linux' from memory.

...a heresy. Modern PuTTY supports 256 colors; the best setting is TERM=putty-256color. More precisely, enter this value as the "terminal type string" in the PuTTY config, so it gets auto-exported. There you'll also need to set "translation" to UTF-8, to match your locale's charset.

On Friday, May 11, 2007 at 23:04:34 -0600, Kyle Wheeler wrote:

> for konsole, I've been told that it works best with TERM=xterm-color

Beware: I'm not sure of the following. I've been told that the most recent Konsoles mimic modern Xterms very closely: 256 colors, extended key combinations, and more. In the absence of an up-to-date dedicated terminfo entry, it seems that TERM=xterm-256color could work well, especially for Vim.

Bye!	Alain.
-- 
Mutt muttrc tip to send mails in the first charset of the list that is necessary and sufficient (version for East Europe Latin-2/CP-852/CP-1250 terminal users):
set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:iso-8859-2:windows-1250:utf-8"
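The send_charset tip in the signature above relies on a "first charset that fits" rule: the list is tried left to right, and the first entry able to represent the whole message is used. Below is a minimal Python sketch of that selection rule only. The function name pick_send_charset is hypothetical, and Python's codecs stand in for the iconv tables Mutt actually uses, so edge cases may differ.

```python
# Sketch of a "first charset that can encode the text" selection,
# modelled on the colon-separated send_charset list from the signature.
# Assumption: Python codecs approximate the iconv conversions Mutt performs.

SEND_CHARSET = (
    "us-ascii:iso-8859-1:iso-8859-15:windows-1252:"
    "iso-8859-2:windows-1250:utf-8"
)


def pick_send_charset(text, send_charset=SEND_CHARSET):
    """Return the first charset in the list that encodes text losslessly."""
    for name in send_charset.split(":"):
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset lacks some character, try the next one
        return name
    return None  # nothing in the list fits


if __name__ == "__main__":
    for sample in ("won't reply", "Salve Håkedal", "100 €", "Łódź", "日本語"):
        print(sample, "->", pick_send_charset(sample))
```

Putting utf-8 last in the list means plain messages go out in the narrowest charset that suffices, and only text that needs it is sent as UTF-8.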
Re: Charset issue?
On Sun, 13 May 2007 or thereabouts, Alain Bench came forth with:

> On Saturday, May 12, 2007 at 14:37:14 +1200, Roland Hill wrote:
> > $ echo $LANG
> > en_NZ.UTF-8
> OK: For now, keep this one. And don't set $charset in muttrc: It is
> supposed to take the good value automagically. Type ":set ?charset"
> directly in Mutt to check it says charset="utf-8". Then take a look at
> the garbled messages.
> > LANGUAGE=en_NZ:en
> That is a valid value: A colon-separated list of locales in preferred
> order for messages. A sort-of super-LC_MESSAGES. However your setting is
> not very interesting, so I'd tend to advise unsetting it, and wiping it
> from whatever startup file(s) do set it. But you can do as you like.

Alain, I'll jump in at this point and say that I have made some changes since we started this, in the interest of trying to help myself. I posted a "SOLVED" followup, but maybe it isn't? I hope this hasn't wasted your time. Garbled messages are now fine.

> Damn! Sorry Roland, my mistake: I managed to give you a syntax
> that your printf didn't like. Please excuse me, and run this instead:
> | $ printf "L1: won\xB4t \xA8reply\xA8\nU8: won\xC2\xB4t
> \xC2\xA8reply\xC2\xA8\n"

I'm not near my server to produce pretty pictures, but:

- On the L1 line I get "won't" displayed correctly, followed by "reply" between quote marks.
- On the U8 line I get "won" followed by an "A with a caret on top" followed by an apostrophe and a "t". I then get "A with caret" quote mark "reply" "A with caret" quote mark.

[snip all good TERM stuff I need to follow up on]

Thanks again Alain. I have learnt something again and hope that this is solved, or nearly solved.

-- 
Regards,
Roland
PGP Key 0xDA39319B = BCF0 1214 BAE9 5A3D 46FC 21A6 360D 9398 DA39 319B
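One reading consistent with the result reported above is that the terminal in use for the test was interpreting bytes as a single-byte Latin-1 style charset rather than UTF-8. The Python sketch below decodes the exact bytes from the printf command both ways. The helper name show is hypothetical, and decoding in Python only approximates what a terminal emulator does with those bytes.

```python
# The two test lines from the printf command, as raw bytes.
L1_BYTES = b"L1: won\xb4t \xa8reply\xa8"              # Latin-1 encoded
U8_BYTES = b"U8: won\xc2\xb4t \xc2\xa8reply\xc2\xa8"  # UTF-8 encoded


def show(raw, terminal_charset):
    """Decode raw bytes the way a terminal set to terminal_charset might.

    Invalid sequences become U+FFFD, roughly what a UTF-8 terminal
    displays for stray high bytes.
    """
    return raw.decode(terminal_charset, errors="replace")


if __name__ == "__main__":
    for charset in ("iso-8859-1", "utf-8"):
        print("terminal assumed to be", charset)
        print("  ", show(L1_BYTES, charset))
        print("  ", show(U8_BYTES, charset))
```

On a Latin-1 terminal the L1 line looks right and every two-byte UTF-8 sequence in the U8 line shows up as "Â" plus the intended character, which matches the "A with a caret on top" description. On a UTF-8 terminal it is the other way round: the U8 line is clean and the L1 line shows replacement characters.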
Re: Charset issue?
This has been an interesting thread to follow for someone who still hasn't gotten his head around the locale/encoding stuff. The following:

> - U8 line I get "won" followed by an "A with a caret on top" followed
> by an apostrophe and a "t". I then get "A with caret" quote mark
> "reply" "A with caret" quote mark.

made me think: I would have loved to see a website where all these common encoding errors were collected in a table or a database, where one could check: "there are 'As with carets on top' in my message where I expected this and that character" [lookup in table] "Aha, the message is in encoding x but my reader thinks it's y".

In other words: some kind of cheat sheet, "if you have many characters like this, the error is most likely this", etc. Anyone know of anything like that?

Eyolf
-- 
Catching his children with their hands in the new, still wet, patio, the father spanked them. His wife asked, "Don't you love your children?" "In the abstract, yes, but not in the concrete."
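A cheat sheet of the kind asked for above can be generated rather than collected by hand: encode each character of interest in one charset, decode the bytes in another, and index the result by what the reader sees. The sketch below is a minimal Python illustration. The function name mojibake_table and the short charset list are assumptions chosen for the example, not an existing tool.

```python
# Build a lookup table: garbled text -> possible (intended, actual, assumed).
# "actual" is the charset the bytes really are in, "assumed" is the
# charset the reader wrongly applied. The charset list is illustrative.

CHARSETS = ("utf-8", "iso-8859-1", "windows-1252", "iso-8859-2")


def mojibake_table(chars, charsets=CHARSETS):
    table = {}
    for ch in chars:
        for actual in charsets:
            try:
                raw = ch.encode(actual)
            except UnicodeEncodeError:
                continue  # character does not exist in this charset
            for assumed in charsets:
                if assumed == actual:
                    continue
                try:
                    seen = raw.decode(assumed)
                except UnicodeDecodeError:
                    continue  # reader would show an error, not mojibake
                if seen != ch:
                    table.setdefault(seen, []).append((ch, actual, assumed))
    return table


table = mojibake_table("åöé´¨")

if __name__ == "__main__":
    for garbled, causes in sorted(table.items()):
        for intended, actual, assumed in causes:
            print(f"{garbled!r}: meant {intended!r}, "
                  f"bytes are {actual}, read as {assumed}")
```

Looking up the "A with caret" pattern in such a table points at UTF-8 bytes being read as Latin-1 or windows-1252, which is the usual cause of that symptom.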
Re: Locale problem and sent index
On Sat, May 12, 2007 at 12:01:14AM +0200, Alain Bench wrote:
> > results in an ? substituting the 'å' in 'Salve Håkedal' in the
> > receiver's inbox.

Hmmm. When viewed in Mutt, I see a multibyte, centered dot, but in my editor (nvi-m17n) I see an "a:", a multibyte character that looks like an "a" with two dots above it.

I'm very embarrassed, but after all these years I still cannot get mutt to display all the characters that I should be able to view. "nvi-m17n" does it with such "ease". (Just a comment out of frustration. I've set and reset all the charset-related settings time and time again. The one that works most of the time is in place now.)

-- 
henry nelson
WWW_HOME=http://yuba(dot)ne(dot)jp/(tilde)home/
Re: Locale problem and sent index
On Mon, May 14, 2007 at 07:00:11AM +0900, Henry Nelson wrote:
> On Sat, May 12, 2007 at 12:01:14AM +0200, Alain Bench wrote:
> > > results in an ? substituting the 'å' in 'Salve Håkedal' in the
> > > receiver's inbox.
>
> Hmmm. When viewed in Mutt, I see a multibyte, centered dot, but in
> my editor (nvi-m17n) I see an "a:", a multibyte character that looks
> like an "a" with two dots above it.

What's even more frustrating is that after using nvi-m17n to edit/reply to the mail, the character now appears in mutt as "\xe5", but it is still viewable in the editor.

-- 
henry nelson
WWW_HOME=http://yuba(dot)ne(dot)jp/(tilde)home/
Re: Locale problem and sent index
On Mon, May 14, 2007 at 07:00:11AM +0900, Henry Nelson wrote:
> On Sat, May 12, 2007 at 12:01:14AM +0200, Alain Bench wrote:
> > > results in an ? substituting the 'å' in 'Salve Håkedal' in the
> > > receiver's inbox.
>
> Hmmm. When viewed in Mutt, I see a multibyte, centered dot, but in
> my editor (nvi-m17n) I see an "a:", a multibyte character that looks
> like an "a" with two dots above it.
>
> I'm very embarrassed, but after all these years I still cannot get mutt
> to display all the characters that I should be able to view. "nvi-m17n"
> does it with such "ease". (Just a comment out of frustration. I've
> set and reset all the charset-related settings time and time again. The
> one that works most of the time is in place now.)
>

Allowing for the vagaries of fonts (to me, it looks like an "a with ring above"), this sounds as if nvi-m17n is the problem.

ĸen
-- 
das eine Mal als Tragödie, das andere Mal als Farce
Re: Locale problem and sent index
On Sun, May 13, 2007 at 11:34:08PM +0100, Ken Moffat wrote:
> On Mon, May 14, 2007 at 07:00:11AM +0900, Henry Nelson wrote:
> > On Sat, May 12, 2007 at 12:01:14AM +0200, Alain Bench wrote:
> > > > results in an ? substituting the 'å' in 'Salve Håkedal' in the
> > > > receiver's inbox.
> >
> > Hmmm. When viewed in Mutt, I see a multibyte, centered dot, but in
> > my editor (nvi-m17n) I see an "a:", a multibyte character that looks
> > like an "a" with two dots above it.
[...]
>
> Allowing for the vagaries of fonts, (to me, it looks like an "a with
> ring above"), this sounds as if nvi-m17n is the problem.

The problem with this particular glyph is my poor eyesight. Looking more closely, it does indeed appear to be a "lower case 'a' with a flattened ring above it."

> das eine Mal als Tragödie, das andere Mal als Farce

Here is exactly the same problem: a multibyte, centered dot, no different from 'å', when I read the mail in mutt, and a multibyte that looks like a lower case 'o' with a horizontal line above it in $editor, nvi-m17n, as I compose this reply.

Personally, I _guess_ the "problem" is that iconv can only handle "textbook" charset conversions, whereas nvi-m17n has finely tuned heuristics to handle the myriad of exceptions/abuses prevalent in "real-world" Japan.

-- 
henry nelson
WWW_HOME=http://yuba(dot)ne(dot)jp/(tilde)home/
Re: Locale problem and sent index
On Sun, May 13, 2007 at 11:34:08PM +0100, Ken Moffat wrote:
> > > > results in an ? substituting the '$(D+)(B' in 'Salve H$(D+)(Bkedal'
> > > > in the
[...]
> $(D)G(Ben
[...]
> das eine Mal als Trag$(D+S(Bdie, das andere Mal als Farce

When reading the mail in Mutt, all three of the non-ascii characters appear exactly the same (a dot in the center of a double-width space), whereas as I enter this reply in $editor, I see a " 'a' with ring above", "K" and " 'o' with horizontal line above".

-- 
henry nelson
WWW_HOME=http://yuba(dot)ne(dot)jp/(tilde)home/
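The odd "$(D+)(B" fragments in the quoted text above look like ISO-2022-JP escape sequences that switch to the JIS X 0212 supplementary set and back to ASCII, with the ESC bytes lost somewhere along the way. That is an assumption about this archive copy, not something stated in the thread. Under that assumption, the Python sketch below puts the ESC bytes back and decodes with Python's iso2022_jp_1 codec. The helper name restore_escapes is hypothetical, and the blind replacement of "(B" would misfire on ordinary text that happens to contain those two characters.

```python
ESC = b"\x1b"


def restore_escapes(archived):
    """Re-insert ESC before the two designators seen in the archived text.

    "$(D" selects JIS X 0212 and "(B" returns to ASCII in ISO-2022-JP-1.
    Assumption: the archive dropped only the ESC byte before each of them.
    """
    raw = archived.encode("ascii")
    raw = raw.replace(b"$(D", ESC + b"$(D")
    raw = raw.replace(b"(B", ESC + b"(B")
    return raw.decode("iso2022_jp_1")


if __name__ == "__main__":
    for fragment in ("Salve H$(D+)(Bkedal", "$(D)G(Ben", "Trag$(D+S(Bdie"):
        print(fragment, "->", restore_escapes(fragment))
```

If the assumption holds, the three fragments decode back to the names and the signature text used earlier in the thread, which would mean the editor re-encoded the Latin characters into a Japanese charset that contains them, rather than losing them.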
Re: Locale problem and sent index
On Mon, May 14, 2007 at 11:22:26AM +0900, Henry Nelson wrote:
> When reading the mail in Mutt, all three of the non-ascii characters
> appear exactly the same (a dot in the center of a double-width space),
> whereas as I enter this reply in $editor, I see a " 'a' with ring above",
> "K" and " 'o' with horizontal line above".

If this is hard to visualize, try: "http://yuba.ne.jp/~home/multibyte.html".

-- 
henry nelson
WWW_HOME=http://yuba(dot)ne(dot)jp/(tilde)home/