On 12/03/2013 11:27 AM, Herbert Duerr wrote:

We should drop our support for ASCII?

UTF-8 contains ASCII. This was one of its most important design goals
and IMHO is a key factor that made this encoding such a big success.
[...]

Hm, UTF-8 is not identical to ASCII.  What if I want to write an
OUString to stdout?  Does a regular printf support UTF-8 or would I need
a conversion from UTF-8 to ASCII for that?

If you have an ASCII string, you can print it directly in a UTF-8 locale; no conversion is needed. The inverse also holds: if that same ASCII-only text was encoded as UTF-8, you can print it directly in an ASCII-compatible locale, again without converting the output. The result would be exactly the same, because UTF-8 encodes every ASCII character as the identical single byte.
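To illustrate the point about printing without conversion, here is a minimal C++ sketch. The names is_ascii and debug_print_ascii are made up for this example and are not part of any OpenOffice API; it only assumes that stdout is attached to an ASCII-compatible locale.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns true when every byte is in the 7-bit ASCII range (0x00-0x7F).
// Such a byte sequence is, unchanged, also valid UTF-8.
bool is_ascii(const std::string& bytes) {
    for (unsigned char c : bytes) {
        if (c > 0x7F) return false;
    }
    return true;
}

// Hypothetical debug helper: the caller promises pure ASCII, so the bytes
// can go to stdout unchanged in any ASCII-compatible locale (UTF-8 included).
void debug_print_ascii(const std::string& bytes) {
    assert(is_ascii(bytes) && "convert non-ASCII text before printing");
    std::printf("%s\n", bytes.c_str());
}
```

Text that fails the is_ascii check is exactly the case where a conversion to the local encoding is needed before printing.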

printf() and friends work in the encoding selected by the LC_CTYPE locale category, which is usually set through the environment. Nowadays this is very often UTF-8, which is backward compatible with ASCII.
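As a rough sketch of how a program could check which encoding LC_CTYPE selects, the following uses the POSIX calls setlocale() and nl_langinfo(CODESET). This is a stand-in for illustration on POSIX systems only, not what the OpenOffice code base does; locale_codeset and is_utf8_codeset are names invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>
#include <langinfo.h>
#include <string>

// Returns the character set name of the LC_CTYPE locale taken from the
// environment, e.g. "UTF-8" (the exact spelling is platform dependent).
std::string locale_codeset() {
    std::setlocale(LC_CTYPE, "");            // adopt the user's LC_CTYPE
    const char* name = nl_langinfo(CODESET); // POSIX: never a null pointer
    assert(name != nullptr);
    return std::string(name);
}

// Hypothetical helper: accepts common spellings such as "UTF-8" and "utf8".
bool is_utf8_codeset(const std::string& name) {
    std::string norm;
    for (unsigned char c : name) {
        if (c != '-') norm.push_back(static_cast<char>(std::tolower(c)));
    }
    return norm == "utf8";
}
```

Note that setlocale() changes process-wide state, so a real program would normally call it once at startup.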

Some encodings are not ASCII-compatible though, e.g. EBCDIC or DBCS (double-byte character sets). If you printed ASCII text in such an environment without converting it first, you'd get gibberish. So if you want to make sure that what you want is what you get, you should always convert to the local encoding as determined by osl_getThreadTextEncoding().

But ASCII and UTF-8 are quite dominant nowadays, especially on developer machines. While we could fix all debug printing for non-ASCII-compatible environments, I suggest not investing too much energy in such a task. The number of developers we'd win by supporting e.g. EBCDIC-based development environments would most probably not justify the developer effort we'd have to spend to achieve that support.

I would have said that the ASCII values from 0 to 127 are the same in UTF-8, but byte values greater than 127, which strictly speaking are not ASCII but belong to 8-bit extensions such as ISO-8859-1, are a problem. I recently had a problem with that when a document contained byte 160, a non-breaking space in those extensions. I became aware of it when I was asked "hey, why does this file look different after it was converted to UTF-8?"
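A small sketch of why the file changes: assuming the original document was ISO-8859-1 (the message does not say which 8-bit encoding it was), every byte above 127 becomes two bytes in UTF-8. The function name latin1_to_utf8 is made up for this example.

```cpp
#include <cassert>
#include <string>

// Converts ISO-8859-1 (Latin-1) bytes to UTF-8. Bytes 0x00-0x7F are copied
// unchanged; bytes 0x80-0xFF map to code points U+0080-U+00FF, which UTF-8
// encodes as two bytes: 110000xx 10xxxxxx.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8.push_back(static_cast<char>(c));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    // The output can only stay the same length or grow.
    assert(utf8.size() >= latin1.size());
    return utf8;
}
```

With this mapping the single byte 160 (0xA0) turns into the two bytes 0xC2 0xA0, so a tool that compares or displays raw bytes will show a difference even though the character is still a non-breaking space.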

--
Andrew Pitonyak
My Macro Document: http://www.pitonyak.org/AndrewMacro.odt
Info:  http://www.pitonyak.org/oo.php

