On 21 Apr 2004, at 16:54, Dan Sugalski wrote:

Woohoo! Cool, and thanks very much.

No problem. I haven't found anyone to come on board yet, but I did get an answer to your question.


If he's up for it, could you ask him a question? Namely "Treating all text as Unicode--good idea or bad idea?" If the answer's going to be a lot of work you can skip it, that's OK.

The answer is fairly straightforward, fortunately.


Talking to Burnhard and perky on HanIRC, I was able to get the following information:

- there are (of course) some character sets that don't work well with Unicode -- for example, Big5-HKSCS can't be fully encoded in UCS-2 (though I didn't find out why)

- that being said, the consensus was that internal storage as Unicode is a good idea for modern programming languages and APIs.

- Tcl/Tk's method of per-filehandle filters for EUC, Johab, etc. seems to be useful and well received.

So in essence, what I got from the conversation was that internal storage as Unicode is a good thing (and indeed, expected), so long as a method for conversion on input/output is provided.
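
To make that concrete, here is a minimal sketch of the "Unicode inside, conversion at the edges" arrangement. It is in Python purely for illustration; it is not Tcl code and not anything that came out of the IRC conversation. The function name `transcode` is made up for this example, and the in-memory byte buffers stand in for real filehandles. The `euc_kr` and `johab` codecs are the ones that ship with Python's standard library.

```python
import io


def transcode(raw: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Hypothetical helper: decode on input, hold Unicode internally, encode on output."""
    # Input side: wrap the byte stream in a decoding filter, one per handle.
    reader = io.TextIOWrapper(io.BytesIO(raw), encoding=source_encoding, newline="")
    text = reader.read()  # from here on the program only sees Unicode text

    # Output side: wrap a second byte stream in an encoding filter.
    sink = io.BytesIO()
    writer = io.TextIOWrapper(sink, encoding=target_encoding, newline="")
    writer.write(text)
    writer.flush()  # push buffered text through the encoder into the sink
    return sink.getvalue()


# Usage: the same Korean text read as EUC-KR and written back out as Johab.
sample = "한글"
johab_bytes = transcode(sample.encode("euc_kr"), "euc_kr", "johab")
assert johab_bytes.decode("johab") == sample
```

The point of the design is that only the two wrappers know about legacy encodings; everything between them works on one internal representation.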

Sorry if that doesn't answer all the nuances of the question, but that's the best I can do for now.

Cheers,

~kj
