On Tue, 29 Jul 2008 14:44:03 -0300, Blower, Andy
<[EMAIL PROTECTED]> wrote:
Thiago,
Hi!
Sorry, I don't understand your objection. Could you expand on it, please?
Especially where you say "have a memory and bandwidth penalty using 2
bytes to encode many characters that would be encoded as 1 in UTF-8".
Oops, a typo of mine.
Most Portuguese accented characters are encoded as 2 bytes in UTF-8 and 1
byte in *ISO-8859-1*, AFAIK. So, every time I write "não" ("no"), UTF-8
spends 4 bytes while ISO-8859-1 spends 3.
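
To make the byte math concrete, here is a minimal stand-alone Java sketch
(the class and variable names are mine, purely for illustration; it also
assumes the source file is compiled with a matching -encoding flag so the
"não" literal survives):

import java.io.UnsupportedEncodingException;

public class EncodingSizes {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String word = "não"; // n + ã + o
        // ã takes two bytes in UTF-8 but only one in ISO-8859-1.
        System.out.println("UTF-8:      " + word.getBytes("UTF-8").length + " bytes");      // 4
        System.out.println("ISO-8859-1: " + word.getBytes("ISO-8859-1").length + " bytes"); // 3
    }
}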
In my experience, char encoding can be an absolute nightmare, and having
as much as possible in UTF-8 is highly desirable. IIRC Java uses UTF-16
internally, which does use 2 bytes for each char, but UTF-8 only uses 2
bytes for unusual chars, which is why it's the ideal external charset.
Agreed, but in many languages the characters that are unusual for a speaker
of English (or of any other language without accents) are not unusual at
all; they are frequent.
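
Your point about Java's internal representation is easy to see from code as
well. A small sketch (again, the names are mine; the in-memory figures
assume the char[]-backed String, where each char is a 2-byte UTF-16 code
unit):

import java.io.UnsupportedEncodingException;

public class InternalVsExternal {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String ascii = "hello";
        String accented = "coração"; // Portuguese for "heart"

        // Internally, a String is a sequence of UTF-16 code units, 2 bytes each.
        System.out.println(ascii.length());    // 5 code units -> 10 bytes in memory
        System.out.println(accented.length()); // 7 code units -> 14 bytes in memory

        // Externally, UTF-8 spends 1 byte on each ASCII letter and 2 on ç and ã.
        System.out.println(ascii.getBytes("UTF-8").length);    // 5
        System.out.println(accented.getBytes("UTF-8").length); // 9
    }
}

So for mostly-ASCII text UTF-8 roughly halves the size compared to the
in-memory form, but for Portuguese every accented letter costs one byte more
than in ISO-8859-1, which is the penalty I was talking about.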
I hope I worded my ideas better now.
Regarding database encodings, I think I got confused. The problem was not
the ISO-8859-1-encoded database, but the ISO-8859-1-encoded Tapestry
templates. Every time an accented character was submitted in a form, I would
get 2 characters unless I added accepted-encoding="iso-8859-1" to every
form tag.
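
In case it helps anyone else hitting this, here is my guess at what was
happening, reproduced outside Tapestry in a tiny Java sketch (class name is
mine): the browser submits the accented character as UTF-8 bytes, and when
those bytes are decoded as ISO-8859-1, each byte becomes a character of its
own.

import java.io.UnsupportedEncodingException;

public class MojibakeDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // The browser submits "ã" as its two UTF-8 bytes: 0xC3 0xA3.
        byte[] submitted = "ã".getBytes("UTF-8");

        // Decoding those bytes as ISO-8859-1 turns each byte into its own character.
        String decoded = new String(submitted, "ISO-8859-1");
        System.out.println(decoded);          // Ã£
        System.out.println(decoded.length()); // 2
    }
}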
Thiago