OK, tested on Debian sarge with ICU 3.2, UNICODE database, C locale. upper() and lower() return an empty string for any input, including 7-bit ASCII, regardless of client_encoding, so something is obviously broken.
Have you tested this patch on a UNICODE DB with locale C/POSIX?

... John

> -----Original Message-----
> From: John Hansen
> Sent: Friday, March 25, 2005 10:27 PM
> To: 'Palle Girgensohn'; 'pgsql-hackers@postgresql.org'
> Subject: RE: [HACKERS] Patch for collation using ICU
>
> > --On fredag, mars 25, 2005 16.34.41 +1100 John Hansen
> > <[EMAIL PROTECTED]> wrote:
> >
> > > Useful if it's going to support earlier releases of ICU....
> > >
> > > Not all os's come with ICU3.2, debian for example, currently has 2.1
> > > in testing, and 2.6 in unstable.
> >
> > Oh, OK. FreeBSD has only the 3.2 as port. I can check the older
> > version, I doubt it would make too much difference. Some autoconf
> > sorcery needed, perhaps.
>
> Naww, it's no biggie, we'll just need to include ICU with pg I think.
> I tried that, there are several functions from ICU that you
> use, that are not in ICU2.1
>
> Dono about 2.6.
>
> However, ICU3.2 compiles on debian with a small change to the
> debian/rules file.
> debian/tmp/etc is missing, so add mkdir debian/tmp/etc
>
> ... John
>
> > /Palle
> >
> > > ... John
> > >
> > >> -----Original Message-----
> > >> From: [EMAIL PROTECTED]
> > >> [mailto:[EMAIL PROTECTED] On Behalf Of Palle Girgensohn
> > >> Sent: Friday, March 25, 2005 10:40 AM
> > >> To: pgsql-hackers@postgresql.org
> > >> Subject: [HACKERS] Patch for collation using ICU
> > >>
> > >> Hi!
> > >>
> > >> I've put together a patch for using IBM's ICU package for collation.
> > >>
> > >> If your OS does not have full support for collation or
> > >> uppercase/lowercase in multibyte locales, this might be useful. If
> > >> you are using a multibyte character encoding in your database and
> > >> want collation, i.e. order by, and also lower(), upper() and
> > >> initcap() to work properly, this patch will do just that.
> > >>
> > >> This patch is needed for FreeBSD, since this OS has no support for
> > >> collation of for example unicode locales (that is, wcscoll(3) does
> > >> not do what you expect if you set LC_ALL=sv_SE.UTF-8, for example).
> > >> AFAIK the patch is *not* necessary for Linux, although IBM claims ICU
> > >> collation to be about twice as fast as glibc for simple western
> > >> locales.
> > >>
> > >> It adds a configure switch, `--with-icu', which will set up the code
> > >> to use ICU instead of wchar_t and wcscoll.
> > >>
> > >> This has been tested only on FreeBSD-4.11 & FreeBSD-5-stable, where
> > >> it seems to run well. I've not had the time to do any comparative
> > >> performance tests yet, but it seems it is at least not slower than
> > >> using LATIN1 with sv_SE.ISO8859-1 locale, perhaps even faster.
> > >>
> > >> I'd be delighted if some more experienced postgresql hackers would
> > >> review this stuff. The patch is pretty compact, so it's fast reading
> > >> :) I'm planning to add this patch as an option (tagged
> > >> "experimental") to FreeBSD's postgresql port. Any ideas about whether
> > >> this is a good idea or not?
> > >>
> > >> Any thoughts or ideas are welcome!
> > >>
> > >> Cheers,
> > >> Palle
> > >>
> > >> Patch at:
> > >> <http://people.freebsd.org/~girgen/postgresql-icu/pg-801-icu-2005-03-14.diff>
> > >>
> > >> ICU at sourceforge: <http://icu.sf.net/>
> > >>
> > >> ---------------------------(end of broadcast)---------------------------
> > >> TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
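The wcscoll(3) remark in the quoted message can be checked outside PostgreSQL. What follows is only a minimal sketch, written in Python because its locale.strcoll compares strings according to the C library's current LC_COLLATE setting. The helper name collation_order and the sample words are illustrative assumptions and are not part of the patch.

```python
import functools
import locale


def collation_order(words, locale_name):
    """Sort words using the C library's collation for locale_name.

    Hypothetical helper for illustration only. Returns None when the
    requested locale cannot be set on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strcoll compares two strings under the active LC_COLLATE
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting


# Under the C locale the order simply follows code points,
# so upper-case ASCII letters sort before lower-case ones.
print(collation_order(["a", "B", "b", "A"], "C"))
```

Calling the same helper with "sv_SE.UTF-8" and the words "ä" and "å" is one way to see whether a platform's collation does what is expected: with working Swedish collation "å" should come before "ä", whereas plain code-point order puts "ä" first.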