On 12/03/2013 11:27 AM, Herbert Duerr wrote:
We should drop our support for ASCII?
UTF-8 contains ASCII. This was one of its most important design goals
and IMHO is a key factor that made this encoding such a big success.
[...]
Hm, UTF-8 is not identical to ASCII. What if I want to write an
OUString to stdout? Does a regular printf support UTF-8 or would I need
a conversion from UTF-8 to ASCII for that?
If you have an ASCII string then you can directly print it in a UTF-8
locale. No conversion needed. The inverse is also true: if that string
was encoded as UTF-8 then you can print it directly in an
ASCII-compatible locale. No conversion needed for the output. The result
would be exactly the same.
printf() and friends work in the encoding selected by the LC_CTYPE
locale category (usually set through the environment). Nowadays this is
very often UTF-8, which is backward compatible with ASCII.
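
The compatibility claim above can be checked at the byte level. What
follows is a minimal sketch in Python, used here only for illustration;
the helper name encodes_identically is hypothetical and not part of any
OpenOffice API. It shows that pure ASCII text has the same bytes in both
encodings, while a non-ASCII character does not.

```python
def encodes_identically(text: str) -> bool:
    """Return True if text has the same byte sequence in ASCII and UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a character outside ASCII, so it has no ASCII form.
        return False


if __name__ == "__main__":
    print(encodes_identically("Hello, world"))  # pure ASCII text
    print(encodes_identically("na\u00efve"))    # contains U+00EF
    # In UTF-8, every byte of a non-ASCII character has the high bit set,
    # so it can never be mistaken for an ASCII byte.
    print(all(b >= 0x80 for b in "\u00ef".encode("utf-8")))
```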
Some encodings are not ASCII-compatible though, e.g. EBCDIC or DBCS
(double-byte character sets). If you printed ASCII text in such an
environment without converting it first then you'd get gibberish.
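
To make the gibberish concrete, here is a small sketch, again in Python
and only as an illustration. It assumes cp037 (one common EBCDIC code
page shipped with Python's codecs) stands in for "an EBCDIC
environment"; the function name misread_as_ebcdic is hypothetical.

```python
def misread_as_ebcdic(text: str) -> str:
    """Encode text as ASCII, then interpret those bytes as EBCDIC (cp037)."""
    ascii_bytes = text.encode("ascii")
    # An EBCDIC terminal would interpret the same bytes with its own table.
    return ascii_bytes.decode("cp037")


if __name__ == "__main__":
    print(repr(misread_as_ebcdic("Hello")))  # not "Hello"
    # Converting first yields completely different bytes for the same text.
    print("Hello".encode("ascii").hex())
    print("Hello".encode("cp037").hex())
```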
So if you want to make sure that what you want is what you get then
you should always convert to the local encoding as determined by
osl_getThreadTextEncoding().
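
As a rough analogue of that advice outside the OpenOffice code base, the
sketch below asks the runtime for the locale's preferred encoding and
converts before output. It is not the osl_getThreadTextEncoding() API;
the helper to_local_bytes is a hypothetical name, and replacing
unencodable characters with "?" is just one possible policy.

```python
import locale


def to_local_bytes(text, encoding=None):
    """Encode text for output in the local (or an explicitly given) encoding."""
    if encoding is None:
        # Ask the platform which encoding the current locale prefers.
        encoding = locale.getpreferredencoding(False)
    # errors="replace" substitutes "?" for characters the target
    # encoding cannot represent, instead of raising an exception.
    return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    print(repr(to_local_bytes("debug: value=42")))
    print(repr(to_local_bytes("a\u00a0b", "ascii")))
```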
But ASCII and UTF-8 encodings are quite dominant nowadays, especially
on developer machines. While we could fix all debug-printing for
non-ASCII-compatible environments I suggest not to invest too much
energy into such a task. The developers we'd gain by supporting e.g.
EBCDIC-based development environments would most probably not justify
the developer effort we'd have to spend to achieve this support.
I would have said that the ASCII values from 0 to 127 are the same in
UTF-8, but byte values greater than 127 (often loosely called "extended
ASCII", e.g. ISO-8859-1) are a problem. I recently ran into this when a
document contained character 160, a non-breaking space. I became aware
of it when I was asked "hey, why does this file look different after it
was converted to UTF-8?"
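
A small sketch of what happens to that byte, in Python and only for
illustration. It assumes the original file was ISO-8859-1 (Latin-1),
where byte 160 (0xA0) is the non-breaking space; the function name
latin1_to_utf8 is hypothetical.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    raw = b"10\xa0km"              # byte 0xA0 = non-breaking space in Latin-1
    converted = latin1_to_utf8(raw)
    print(raw.hex(), len(raw))
    print(converted.hex(), len(converted))  # 0xA0 becomes the pair 0xC2 0xA0

    # The single byte 0xA0 is not valid UTF-8 on its own, so reading the
    # unconverted file as UTF-8 fails (or shows a replacement character).
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        print("not valid UTF-8:", exc.reason)
```

So the file does change at the byte level: one byte becomes two, and a
tool that still reads the result as Latin-1 will show an extra character
in front of the space.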
--
Andrew Pitonyak
My Macro Document: http://www.pitonyak.org/AndrewMacro.odt
Info: http://www.pitonyak.org/oo.php