On Mon, Jul 15, 2013 at 8:58 AM, Tatsuo Ishii <is...@postgresql.org> wrote:
> Also I don't understand why you need UTF-16 support as a database
> encoding because UTF-8 and UTF-16 are logically equivalent, they are
> just different representations (encodings) of Unicode. That means if we
> already support UTF-8 (I'm sure we already do), there's no particular
> reason we need to add UTF-16 support.

To be fair, there is a small reason to support UTF-16 even with UTF-8
available. I personally do not find it compelling, but perhaps I am
not best placed to judge such things. As the English Wikipedia article
on UTF-8 says:

"Characters U+0800 through U+FFFF use three bytes in UTF-8, but only
two in UTF-16. As a result, text in (for example) Chinese, Japanese or
Hindi could take more space in UTF-8 if there are more of these
characters than there are ASCII characters. This happens for pure text
but rarely for HTML documents. For example, both the Japanese UTF-8
and the Hindi Unicode articles on Wikipedia take more space in UTF-16
than in UTF-8."

This space saving is the only advantage of UTF-16 over UTF-8 as a
server encoding. I'm inclined to take the fact that there have been so
few (no?) complaints from PostgreSQL's large Japanese user base about
the lack of UTF-16 support as a sign that it isn't considered a
compelling feature in the CJK realm.
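
To put rough numbers on the size argument, here is a minimal Python
sketch that compares the encoded length of a few strings. The helper
name encoded_sizes and the sample strings are made up for illustration;
this says nothing about how PostgreSQL stores text internally.

```python
# Illustrative sketch: compare encoded sizes of the same text in
# UTF-8 and UTF-16. The helper name and samples are hypothetical.

def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for text."""
    utf8_len = len(text.encode("utf-8"))
    # "utf-16-le" is used so that no byte order mark is counted.
    utf16_len = len(text.encode("utf-16-le"))
    return utf8_len, utf16_len


samples = {
    "ascii": "PostgreSQL",        # 10 ASCII characters
    "japanese": "日本語",          # 3 characters in U+0800..U+FFFF
    "html": "<p>日本語</p>",       # same text wrapped in ASCII markup
}

for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: utf-8={utf8_len} bytes, utf-16={utf16_len} bytes")
```

For these samples the pure Japanese string is smaller in UTF-16 (6
bytes against 9), while the ASCII string and the marked-up string are
smaller in UTF-8, which matches the pure-text versus HTML distinction
in the quoted passage.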

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers