On Tue, Feb 28, 2006 at 11:19:02AM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> >>> This may be the only solution. Converting everything to UTF-8 has
> >>> issues because some encodings are not roundtrip-safe
> >
> >> Is this still true?
> >
> > I believe so. If I use the ICU Converter Explorer [1] to examine some of
> > the encodings we support, they have "Contains ambiguous aliases? TRUE".
>
> Which ones, and are they client-only encodings? If all our server-side
> encodings are round-trip safe then I think there's no big issue.
>
> In any case I don't think there's a huge problem if we say that database
> and user names had better be chosen from the round-trip-safe subset.

This is what it says here [1]:

    There are only 19 encodings currently used worldwide as legitimate
    POSIX multi-byte locale encodings: UTF-8, ISO-8859-1, ISO-8859-2,
    ISO-8859-3, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8,
    ISO-8859-9, ISO-8859-13, ISO-8859-15, EUC-JP, EUC-KR, GB2312
    (= EUC-CN), KOI8-R, KOI8-U, VISCII, WINDOWS-1251, WINDOWS-1256

    Each of these is fully roundtrip compatible to ISO 10646, therefore
    all these locales can be represented nicely in wchar_t as the
    equivalent UCS values. The above names and the corresponding
    defining documents are listed in the IANA charset registry.

Some of these have multiple definitions according to ICU, meaning that
different platforms have implemented them differently in the past
(EUC-JP falls into this category), but presumably the IANA charset
registry has proper definitions.

Of the remaining encodings we support, Big5 is OK, although win-950,
which is the Windows version, has changed over time. GBK has the same
problem: win-936 has changed over time too. I don't think we should
concern ourselves with bugs in the Windows encodings.

IOW, I think we are mostly safe.

[1] http://www.cl.cam.ac.uk/~mgk25/ucs/iso2022-wc.html

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
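
For anyone who wants to probe the round-trip claim empirically, here is a
minimal sketch, not a definitive test. It assumes Python's built-in codecs
as a stand-in for whatever conversion tables a given platform actually
uses (they may differ from ICU's and from the server's own tables). The
function name roundtrip_failures and the max_len parameter are made up
for illustration. It only tries byte sequences of up to two bytes, so
three-byte EUC-JP sequences are not covered, and VISCII is left out
because Python ships no codec for it.

```python
from itertools import product


def roundtrip_failures(encoding, max_len=2):
    """Return byte sequences that decode under `encoding` but do not
    re-encode to the same bytes (bytes -> Unicode -> bytes)."""
    failures = []
    for length in range(1, max_len + 1):
        for combo in product(range(256), repeat=length):
            raw = bytes(combo)
            try:
                text = raw.decode(encoding)
            except UnicodeDecodeError:
                continue  # not a valid sequence in this encoding
            try:
                back = text.encode(encoding)
            except UnicodeEncodeError:
                back = None  # decoded fine but cannot be encoded again
            if back != raw:
                failures.append(raw)
    return failures


# Python codec names for the encodings in the quoted list (VISCII omitted).
CANDIDATES = [
    "utf-8", "iso8859-1", "iso8859-2", "iso8859-3", "iso8859-5",
    "iso8859-6", "iso8859-7", "iso8859-8", "iso8859-9", "iso8859-13",
    "iso8859-15", "euc_jp", "euc_kr", "gb2312", "koi8_r", "koi8_u",
    "cp1251", "cp1256",
]

if __name__ == "__main__":
    for name in CANDIDATES:
        bad = roundtrip_failures(name)
        print(f"{name}: {len(bad)} non-round-tripping sequences")
```

This checks only the bytes -> Unicode -> bytes direction, which is the one
that matters when stored names are converted to UTF-8 and back. An empty
result for a codec says nothing about other implementations of the same
encoding name.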