>Thank you very much Ken for acknowledging this issue. Now I need to 
>find out where it comes from. I am sending purely with exmh Sedit
>from my Mac with a French keyboard, so somewhere something seems to go 
>wrong in what the OS sends and what Sedit receives, do I get this right?

Yeah, exactly.  What ends up in the draft is definitely the UTF-8
replacement character.

>Any way I can diagnose this? What would be appropriate tools? I am 
>simply switching from American keyboard layout to French keyboard 
>layout using the MacOS system function for this and in Sedit indeed I 
>see when I am composing the proper character, but maybe for the wrong 
>reasons and maybe it still has the wrong code.

What I would FIRST do is make sure our understanding of what is going
wrong is correct.  Create a draft with some accents in Sedit, save it,
and then use your favorite text editor to view the draft file and
confirm my assertion that the draft file contains the unknown
character where the accented characters should be.
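
If a text editor makes it hard to tell, looking at the raw bytes is
another option.  Here is a minimal sketch in Python, assuming the draft
is supposed to be UTF-8; the function name is made up for illustration.
It only looks for the three bytes EF BF BD, which is how the
replacement character (U+FFFD) is encoded in UTF-8.

```python
# UTF-8 encoding of U+FFFD, the replacement character: b'\xef\xbf\xbd'
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")


def find_replacement_chars(data: bytes) -> list:
    """Return the byte offset of every UTF-8 encoded U+FFFD in data."""
    offsets = []
    start = data.find(REPLACEMENT_UTF8)
    while start != -1:
        offsets.append(start)
        # Continue searching just past the match we found.
        start = data.find(REPLACEMENT_UTF8, start + len(REPLACEMENT_UTF8))
    return offsets


# Example with an in-memory string standing in for a saved draft.
sample = "Dear \ufffdmile,".encode("utf-8")
print(find_replacement_chars(sample))
```

To check a real draft, read the file in binary mode (open(path, "rb"))
and pass the bytes to the function; an empty list means no replacement
character was found.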

>How do I control how these special characters are encoded by ?my OS? 
>(or Sedit?).. sorry if this is a dumb question..

Ooof, that is a tough question.  I do not have a good answer.

My exmh editor is "xterm -e vi".  I know that when I input a character
with an accent or some other non-ASCII character, X sees the right
thing, xterm gets the right event, and then vi puts the UTF-8 in the
draft file.

What happens with Sedit ... well, the problem is Sedit was written way
before i18n was a huge issue, and while I know Tcl/Tk has i18n support,
I do not know if Sedit makes use of it.  It's gonna be a mess to figure
out, that's for sure.

I have some more unfortunate news for you; in your reply to me you show
this quoted line:

>> >Dear �mile,

That's U+00EF U+00BF U+00BD encoded as UTF-8.  So as part of your
reply process, whatever you did interpreted the UTF-8 bytes of the
replacement character (EF BF BD) as ISO-8859-1 characters and then
re-encoded those characters as UTF-8.  Even stranger below:

>> So this shows the problem right here; I see what SHOULD be a � as the

What I sent out as a É was re-encoded by you as the UTF-8 unknown character
symbol.  I am assuming that happened because É is encoded as xC3 x89 and
x89 is not a printable character in ISO-8859-1 (it falls in the C1
control range).  I am a little surprised it's not �, but that's a minor
issue.  My point is that I am assuming you also want REPLIES to be
correctly encoded with accents, and I don't think that is happening now
(it might be correct if someone sends you something encoded with
ISO-8859-1).
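
To make the byte-level story concrete, here is a small Python sketch of
the misinterpretation I am describing.  The helper name is made up, and
this is only a model of what I think happened, not a claim about what
your software actually does.  One caveat: Python's "latin-1" codec maps
every byte, including x89, to a code point, so it will not produce the
unknown character by itself; a stricter decoder might substitute it.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


# U+FFFD is EF BF BD in UTF-8; misread as ISO-8859-1 it becomes
# three separate characters.
garbled = misread_as_latin1("\ufffd")
print([f"U+{ord(c):04X}" for c in garbled])
# Re-encoding those three characters as UTF-8 gives six bytes.
print(garbled.encode("utf-8").hex(" "))

# E-acute (U+00C9) is C3 89 in UTF-8; the second byte lands in the
# C1 control range when misread as ISO-8859-1.
print("\u00c9".encode("utf-8").hex(" "))
e_acute = misread_as_latin1("\u00c9")
print([f"U+{ord(c):04X}" for c in e_acute])
```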

--Ken

_______________________________________________
Exmh-users mailing list
Exmh-users@redhat.com
https://listman.redhat.com/mailman/listinfo/exmh-users
