On Mon, Jul 15, 2013 at 8:58 AM, Tatsuo Ishii <is...@postgresql.org> wrote:
> Also I don't understand why you need UTF-16 support as a database
> encoding because UTF-8 and UTF-16 are logically equivalent, they are
> just different representation (encoding) of Unicode. That means if we
> already support UTF-8 (I'm sure we already do), there's no particular
> reason we need to add UTF-16 support.
To be fair, there is a small reason to support UTF-16 even with UTF-8
available. I personally do not find it compelling, but perhaps I am
not best placed to judge such things. As Wikipedia says in the English
UTF-8 article:

"Characters U+0800 through U+FFFF use three bytes in UTF-8, but only
two in UTF-16. As a result, text in (for example) Chinese, Japanese or
Hindi could take more space in UTF-8 if there are more of these
characters than there are ASCII characters. This happens for pure text
but rarely for HTML documents. For example, both the Japanese UTF-8
and the Hindi Unicode articles on Wikipedia take more space in UTF-16
than in UTF-8."

This is the only advantage of UTF-16 over UTF-8 as a server encoding.
I'm inclined to take the fact that there have been so few (no?)
complaints from PostgreSQL's large Japanese user base about the lack
of UTF-16 support as suggesting that it isn't considered a compelling
feature in the CJK realm.

-- 
Peter Geoghegan
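
P.S. The size trade-off in the Wikipedia quote is easy to check by hand. What follows is only a minimal sketch in Python, not anything from PostgreSQL itself: the helper name encoded_sizes and the sample strings are made up for illustration, and UTF-16 is measured without a byte order mark (the "utf-16-le" codec).

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text; UTF-16 is counted without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical samples: pure ASCII, pure Japanese, and Japanese wrapped in markup.
samples = {
    "ascii": "hello",
    "japanese": "日本語",
    "html": "<p>日本語</p>",
}

for name, sample in samples.items():
    utf8_len, utf16_len = encoded_sizes(sample)
    print(f"{name}: utf-8={utf8_len} bytes, utf-16={utf16_len} bytes")
```

The pure Japanese string is smaller in UTF-16 (6 bytes against 9), but once a little ASCII markup is wrapped around it UTF-8 wins again (16 bytes against 20), which matches the quoted observation about HTML documents.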