> I think it is really not hard to do this for UTF-8. I don't have to know the
> relation between the locale and the encoding. Look at this:
> We can use the LC_CTYPE from pg_controldata or alternatively the LC_CTYPE
> at server startup. For nearly every locale (de_DE, ja_JP, ...) there exists
> also a locale *.utf8 (de_DE.utf8, ja_JP.utf8, ...), at least for the current
> Linux glibc.
My Linux box does not have *.utf8 locales at all. Probably not so many
platforms have them up to now, I guess.

> We don't need to know more than this. If we call
> setlocale(LC_CTYPE, <value of LC_CTYPE extended with .utf8 if not already given>)
> then glibc is aware of doing all the conversions. I attach a small demo program
> which sets the locale ja_JP.utf8 and is able to translate German umlaut A (upper)
> to German umlaut a (lower).

Interesting idea, but the problem is that we have to decide on exactly one
locale before initdb. In my understanding, users willing to use Unicode
(UTF-8) tend to use multiple languages. This is natural, since Unicode
claims it can handle several languages. For example, a user might want to
have a table like this in a UTF-8 database:

create table t1(
	english text,	-- English message
	germany text,	-- German message
	japanese text	-- Japanese message
);

If you have set the locale to, say, de_DE, then:

select lower(japanese) from t1;

would be executed in the de_DE.utf8 locale, and I doubt it produces any
meaningful results for Japanese.
--
Tatsuo Ishii
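
[Editor's note: the demo program mentioned in the quoted mail is not
reproduced here. Below is only a minimal sketch of the idea it describes:
append ".utf8" to a base LC_CTYPE value, call setlocale(LC_CTYPE, ...), and
let glibc's wide-character API do the case mapping. The locale name ja_JP,
the buffer handling, and the umlaut example are assumptions; it only works
on a system where the *.utf8 locale is actually installed.]

#include <stdio.h>
#include <string.h>
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

int
main(void)
{
	char	lc[64] = "ja_JP";	/* base LC_CTYPE, e.g. from pg_controldata */
	wchar_t	upper = L'\u00C4';	/* German umlaut A (upper) */
	wchar_t	lower;

	/* extend with .utf8 if not already given */
	if (strstr(lc, ".utf8") == NULL)
		strcat(lc, ".utf8");

	if (setlocale(LC_CTYPE, lc) == NULL)
	{
		fprintf(stderr, "locale %s is not installed\n", lc);
		return 1;
	}

	/* glibc's UTF-8 locales carry the Unicode case mapping */
	lower = towlower(upper);

	/* prints the umlaut pair in UTF-8, e.g. "A-umlaut -> a-umlaut" */
	printf("%lc -> %lc\n", (wint_t) upper, (wint_t) lower);
	return 0;
}

Whether a single such locale is meaningful for a database that mixes
several languages is exactly the question raised above.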