On Tue, 29 Jul 2008 14:44:03 -0300, Blower, Andy <[EMAIL PROTECTED]> wrote:

Thiago,

Hi!

Sorry, I don't understand your objection. Could you expand on it, please? Especially where you say "have a memory and bandwidth penalty using 2 bytes to encode many characters that would be encoded as 1 in UTF-8".

Oooops, typo of mine.
Most Portuguese accented characters are encoded as 2 bytes in UTF-8 and 1 byte in *ISO-8859-1*, AFAIK. So, every time I write "não" ("no"), UTF-8 takes 4 bytes while ISO-8859-1 takes 3.
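
Just to illustrate with plain Java (a quick sketch, nothing Tapestry-specific; class name is just for the example, and it assumes the source file is compiled with the right encoding for the "não" literal):

import java.io.UnsupportedEncodingException;

public class EncodingSizes {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String word = "não";
        // UTF-8: 'n' and 'o' take 1 byte each, 'ã' takes 2 -> 4 bytes
        System.out.println(word.getBytes("UTF-8").length);      // 4
        // ISO-8859-1: every Latin-1 character fits in a single byte -> 3 bytes
        System.out.println(word.getBytes("ISO-8859-1").length); // 3
        // UTF-16BE (no BOM), like Java's internal char representation -> 6 bytes
        System.out.println(word.getBytes("UTF-16BE").length);   // 6
    }
}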

In my experience char encoding can be an absolute nightmare, and having as much as possible in UTF-8 is highly desirable. IIRC Java uses UTF-16 internally, which does use 2 bytes for each char, but UTF-8 only uses 2 bytes for unusual chars, which is why it's the ideal external charset.

Agreed, but in many languages the characters that look unusual (from the point of view of a speaker of English or any other language without accents) are not unusual at all; they are frequent.

I hope I've worded my ideas better now.

Regarding database encodings, I think I got confused. It was not the ISO-8859-1-encoded database that was the problem, but the ISO-8859-1-encoded Tapestry templates. Every time an accented character was submitted in a form, I would get 2 characters unless I added accepted-encoding="iso-8859-1" to every form tag.
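
For the record, the "2 characters" symptom is the classic sign of UTF-8 bytes being decoded as a single-byte charset. A plain-Java sketch of what I believe was happening (nothing Tapestry-specific, just my guess at the mismatch):

import java.io.UnsupportedEncodingException;

public class EncodingMismatch {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // The browser submits the form data as UTF-8 bytes...
        byte[] submitted = "não".getBytes("UTF-8");
        // ...but the server side decodes them as ISO-8859-1, so the two
        // UTF-8 bytes of 'ã' (0xC3 0xA3) come back as the two characters "Ã£".
        System.out.println(new String(submitted, "ISO-8859-1")); // prints "nÃ£o"
    }
}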

Thiago
