Re: [HACKERS] Patch for collation using ICU

2005-05-10 Thread John Hansen
Tatsuo Ishii wrote: > Sent: Tuesday, May 10, 2005 5:45 PM > To: John Hansen > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > Tatsuo Ishii wrote: > > > Sent: Tuesday, May

Re: [HACKERS] Patch for collation using ICU

2005-05-10 Thread Tatsuo Ishii
> Tatsuo Ishii wrote: > > Sent: Tuesday, May 10, 2005 12:32 AM > > To: John Hansen > > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > > pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] Patch for collation using ICU > > > > > > --

Re: [HACKERS] Patch for collation using ICU

2005-05-09 Thread John Hansen
Tatsuo Ishii wrote: > Sent: Tuesday, May 10, 2005 12:32 AM > To: John Hansen > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > > -Original Message- > > > From: T

Re: [HACKERS] Patch for collation using ICU

2005-05-09 Thread Tatsuo Ishii
> > -Original Message- > > From: Tatsuo Ishii [mailto:[EMAIL PROTECTED] > > Sent: Sunday, May 08, 2005 11:08 PM > > To: John Hansen > > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > > pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] Pa

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
> -Original Message- > From: Tatsuo Ishii [mailto:[EMAIL PROTECTED] > Sent: Sunday, May 08, 2005 11:08 PM > To: John Hansen > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU >

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
Tatsuo Ishii wrote: > Sent: Sunday, May 08, 2005 11:08 PM > To: John Hansen > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > > I don't buy it. If current conversio

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
Tom Lane wrote: > Sent: Monday, May 09, 2005 2:47 AM > To: Palle Girgensohn > Cc: Tatsuo Ishii; John Hansen; [EMAIL PROTECTED]; > pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > Palle Girgensohn <[EMAIL PROT

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
Tatsuo Ishii wrote: > Sent: Sunday, May 08, 2005 11:19 PM > To: John Hansen > Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > > > > On Sun, May 08, 2005

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Palle Girgensohn
> Palle Girgensohn <[EMAIL PROTECTED]> writes: >>> I'm confused. I thought the ICU patches is intended for using on >>> broken locale platforms? > >> It will sort correctly in *one* locale, using ICU. You still cannot mix >> different locales in the same database cluster, the collation locale is >>

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Andrew Dunstan
Magnus Hagander wrote: The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2. That means the size of the distribution would almost *double* if we bundled ICU. It's probably fine bundling it in the binary distributions (at least we'd probably do it on win32, since not many ppl will h
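
A quick check of the "almost double" estimate quoted above, using only the two tarball sizes given in the message (9.8 MB for ICU 3.2, 13.2 MB for PostgreSQL 8.0.2). The variable names are illustrative. The sum deliberately ignores the point raised elsewhere in the thread that bundling ICU might allow some existing conversion and locale code to be removed.

```python
# Tarball sizes in MB, as quoted in the message above.
icu_32_mb = 9.8
postgresql_802_mb = 13.2

# Size of a source distribution that bundles ICU unchanged.
bundled_mb = icu_32_mb + postgresql_802_mb
growth = bundled_mb / postgresql_802_mb

print(f"bundled: {bundled_mb:.1f} MB, {growth:.2f}x the current size")
```

On these numbers the distribution would grow by roughly three quarters rather than strictly doubling.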

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tom Lane
Palle Girgensohn <[EMAIL PROTECTED]> writes: >> I'm confused. I thought the ICU patches is intended for using on >> broken locale platforms? > It will sort correctly in *one* locale, using ICU. You still cannot mix > different locales in the same database cluster, the collation locale is > still

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tatsuo Ishii
> > > > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: > > > > > Tatsuo Ishii wrote: > > > > > > > > > > So Japanese(including ASCII)/UNICODE behavior is > > > > perfectly correct > > > > > > at this moment. > > > > > > > > > > Right, so you _never_ use accented ascii characters in

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Palle Girgensohn
--On söndag, maj 08, 2005 22.19.25 +0900 Tatsuo Ishii <[EMAIL PROTECTED]> wrote: > > > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: > > > > Tatsuo Ishii wrote: > > > > > > > > So Japanese(including ASCII)/UNICODE behavior is > > > perfectly correct > > > > > at this moment. > > >

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tatsuo Ishii
> > I don't buy it. If current conversion tables do the right > > thing, why do we need to replace them? Or if conversion tables are > > not correct, why don't you fix them? I think the rule of > > character conversion will not change frequently, especially > > for LATIN languages. Thus maintaining cos

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
> The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2. > That means the size of the distribution would almost *double* > if we bundled ICU. Ermm,. Don't forget to remove the current charset conversions and locale support before making your size estimation. > > It's probably fin

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Magnus Hagander
>> The 3.2 vs 2.8 business is disturbing also; specifically, I >> don't think we get to require 3.2 on a platform where 2.8 is >> installed. > >There seems to be nothing in the ICU licence that would prevent us from >bundling it. >This would solve both the 3.2 vs 2.8 problems, and would remove th

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Magnus Hagander
>> Is this patch ready for application? >> >> http://people.freebsd.org/~girgen/postgresql-icu/pg-802-icu-2005-05-06.diff.gz >> >> The web site is: >> >> http://people.freebsd.org/~girgen/postgresql-icu/readme.html > >I don't think so, not quite. I have not had any positive >repo

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen
Tatsuo Ishii > Sent: Sunday, May 08, 2005 3:41 PM > To: John Hansen > Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > Alvaro Herrera wrote: > > > S

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Tatsuo Ishii
> Alvaro Herrera wrote: > > Sent: Sunday, May 08, 2005 2:49 PM > > To: John Hansen > > Cc: Tatsuo Ishii; pgman@candle.pha.pa.us; > > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] Patch for collation using ICU > > > > O

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Alvaro Herrera wrote: > Sent: Sunday, May 08, 2005 2:49 PM > To: John Hansen > Cc: Tatsuo Ishii; pgman@candle.pha.pa.us; > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > On Sun, May 08, 2005 at 02:07:29PM +10

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Alvaro Herrera
On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: > Tatsuo Ishii wrote: > > So Japanese(including ASCII)/UNICODE behavior is perfectly > > correct at this moment. > > Right, so you _never_ use accented ascii characters in Japanese? > (like è for example, whose uppercase is È) That

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Tatsuo Ishii wrote: > Sent: Sunday, May 08, 2005 10:09 AM > To: John Hansen > Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > > Bruce Momjian wrote: > > > > > > T

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> I don't buy it. If current conversion tables do the right > thing, why do we need to replace them? Or if conversion tables are > not correct, why don't you fix them? I think the rule of > character conversion will not change frequently, especially > for LATIN languages. Thus maintaining cost is not t

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Tom Lane wrote: > "John Hansen" <[EMAIL PROTECTED]> writes: > > Btw, I had been planning to propose replacing every single > one of the > > built in charset conversion functions with calls to ICU > (thus making > > pg _depend_ on ICU), > > I find that fairly unacceptable ... especially given t

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Tatsuo Ishii
> Palle Girgensohn wrote: > > > > --On lördag, maj 07, 2005 23.15.29 +1000 John Hansen <[EMAIL PROTECTED]> > > wrote: > > > > > Btw, I had been planning to propose replacing every single one of the > > > built in charset conversion functions with calls to ICU (thus making pg > > > _depend_ on IC

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Tatsuo Ishii
> Bruce Momjian wrote: > > > > There are two reasons for that optimization --- first, some > > locale support is broken and Unicode encoding with a C locale > > crashes (not an issue for ICU), and second, it is an > > optimization for languages like Japanese that want to use

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: > Btw, I had been planning to propose replacing every single one of the > built in charset conversion functions with calls to ICU (thus making > pg _depend_ on ICU), I find that fairly unacceptable ... especially given the licensing questions, but in any c

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Tom Lane wrote: > "John Hansen" <[EMAIL PROTECTED]> writes: > > Where'd you get the licence from? > > It was the first thing I came across in their docs: > > http://icu.sourceforge.net/userguide/intro.html > > Looking more closely, it may be that this license is only > intended to apply to the

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 10.58.09 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: "John Hansen" <[EMAIL PROTECTED]> writes: Where'd you get the licence from? It was the first thing I came across in their docs: http://icu.sourceforge.net/userguide/intro.html Looking more closely, it may be that this lice

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: > Where'd you get the licence from? It was the first thing I came across in their docs: http://icu.sourceforge.net/userguide/intro.html Looking more closely, it may be that this license is only intended to apply to the documentation and not the code ...

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
John Hansen wrote: > Bruce Momjian wrote: > > > > There are two reasons for that optimization --- first, some > > locale support is broken and Unicode encoding with a C locale > > crashes (not an issue for ICU), and second, it is an > > optimization for languages like Japanese that want to use

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 10.06.43 -0400 Bruce Momjian wrote: Palle Girgensohn wrote: --On lördag, maj 07, 2005 23.15.29 +1000 John Hansen <[EMAIL PROTECTED]> wrote: > Btw, I had been planning to propose replacing every single one of the > built in charset conversion functions with calls to IC

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Bruce Momjian wrote: > Palle Girgensohn wrote: > > > > --On lördag, maj 07, 2005 23.15.29 +1000 John Hansen > > <[EMAIL PROTECTED]> > > wrote: > > > > > Btw, I had been planning to propose replacing every single one of > > > the built in charset conversion functions with calls to ICU (thus > >

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
Palle Girgensohn wrote: > >> This is because in the standard postgres implementation, upper/lower is > >> done one character at the time. A proper upper/lower cannot do it that > >> way. Other known example is in Turkish, where an ? (?) should look > >> different whether it is an initial letter o
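
The point quoted above, that upper/lower cannot be done correctly one character at a time, can be sketched in a few lines. This is only an illustration and not code from the patch or from PostgreSQL: the German ß is an example chosen here (the Turkish characters in the quoted message did not survive the archive), and `per_character_upper` is a hypothetical helper that mimics a routine limited to one-to-one mappings.

```python
def per_character_upper(text: str) -> str:
    """Uppercase one character at a time, keeping only 1:1 mappings."""
    out = []
    for ch in text:
        up = ch.upper()
        # A character-at-a-time routine with a fixed-width result cannot
        # store a mapping that expands to several characters, so it has
        # to leave such a character unchanged.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


word = "straße"
print(per_character_upper(word))  # 1:1 mappings only
print(word.upper())               # full-string Unicode case mapping
```

The two results differ because the full mapping of ß expands to two letters, which a per-character routine has no room for.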

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 09.52.59 -0400 Bruce Momjian wrote: Palle Girgensohn wrote: >> Also, apparently, ICU is installed by default in many linux >> distributions, and usually it is version 2.8. Some linux users have >> asked me if there are plans for a patch that works with ICU 2.8. >> Th

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> It seems 3.2 has much more support and bug fixes, I'm not > sure if we should really consider 2.8? As I said, probably not worth the effort. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.post

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
Palle Girgensohn wrote: > > --On lördag, maj 07, 2005 23.15.29 +1000 John Hansen <[EMAIL PROTECTED]> > wrote: > > > Btw, I had been planning to propose replacing every single one of the > > built in charset conversion functions with calls to ICU (thus making pg > > _depend_ on ICU), as this woul

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Bruce Momjian wrote: > > There are two reasons for that optimization --- first, some > locale support is broken and Unicode encoding with a C locale > crashes (not an issue for ICU), and second, it is an > optimization for languages like Japanese that want to use > unicode, but don't need a lo

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
Andrew Dunstan wrote: > > > John Hansen wrote: > > >Here is the list of encoding names and aliases the ICU accepts as of > >3.2: > >(it's a bit long...) > > > >UTF-8 ibm-1208 ibm-1209 ibm-5304 ibm-5305 windows-65001 cp1208 > >UTF-16 ISO-10646-UCS-2 unicode csUnicode ucs-2 > > > > > > > [snip]

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
Palle Girgensohn wrote: > >> Also, apparently, ICU is installed by default in many linux > >> distributions, and usually it is version 2.8. Some linux users have > >> asked me if there are plans for a patch that works with ICU 2.8. That's > >> probably a good idea. IBM and the ICU folks seem to

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> -Original Message- > From: Andrew Dunstan [mailto:[EMAIL PROTECTED] > Sent: Saturday, May 07, 2005 11:39 PM > To: John Hansen > Cc: Palle Girgensohn; Bruce Momjian; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > >

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
John Hansen wrote: > > --On lördag, maj 07, 2005 22.53.46 +1000 John Hansen > > <[EMAIL PROTECTED]> > > wrote: > > > > > Errm,... initdb --encoding UNICODE --locale C > > > > You mean that ICU *shall* be used even for the C locale, and > > not as Bruce suggested here: > > Yes, that's exactly w

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> Did you try the latest patch? Maybe it will help, and if not, it will > (hopefully) give a lot more informative error messages. No, and I got rid of my debian boxes @ home. The patch required a certain amount of modifications too, to even compile with 2.8. So I guess it's a valid question to as

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Andrew Dunstan
John Hansen wrote: Here is the list of encoding names and aliases the ICU accepts as of 3.2: (it's a bit long...) UTF-8 ibm-1208 ibm-1209 ibm-5304 ibm-5305 windows-65001 cp1208 UTF-16 ISO-10646-UCS-2 unicode csUnicode ucs-2 [snip] Don't we use "unicode" as an alias for UTF-8? cheers andrew

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 23.33.31 +1000 John Hansen <[EMAIL PROTECTED]> wrote: -Original Message- From: Palle Girgensohn [mailto:[EMAIL PROTECTED] Sent: Saturday, May 07, 2005 11:33 PM To: John Hansen; Bruce Momjian Cc: pgsql-hackers@postgresql.org Subject: RE: [HACKERS] Pat

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
nal Message- From: John Hansen Sent: Saturday, May 07, 2005 11:09 PM To: 'Palle Girgensohn'; 'Bruce Momjian' Cc: 'pgsql-hackers@postgresql.org' Subject: RE: [HACKERS] Patch for collation using ICU > --On lördag, maj 07, 2005 22.53.46 +1000 John Hansen >

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> -Original Message- > From: Palle Girgensohn [mailto:[EMAIL PROTECTED] > Sent: Saturday, May 07, 2005 11:33 PM > To: John Hansen; Bruce Momjian > Cc: pgsql-hackers@postgresql.org > Subject: RE: [HACKERS] Patch for collation using ICU > > > > --On lörda

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 22.22.52 +1000 John Hansen <[EMAIL PROTECTED]> wrote: I use this patch in production on one FreeBSD 4.10 server at the moment. With the latest version, I've had no problems. Logging is switched on for now, and it shows no signs of ICU complaining. I'd like more reports o

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> -Original Message- > From: Palle Girgensohn [mailto:[EMAIL PROTECTED] > Sent: Saturday, May 07, 2005 11:30 PM > To: John Hansen; Bruce Momjian > Cc: pgsql-hackers@postgresql.org > Subject: RE: [HACKERS] Patch for collation using ICU > > > > --On lörda

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 23.25.15 +1000 John Hansen <[EMAIL PROTECTED]> wrote: Palle Girgensohn wrote: I'm aware of that. It might help for unicode, but there are a bunch of other encodings. IANA has decided that utf-8 has *no* aliases, hence only utf-8 (with dash, but case insensitive) is accep

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Palle Girgensohn wrote: > I'm aware of that. It might help for unicode, but there are a > bunch of > other encodings. IANA has decided that utf-8 has *no* > aliases, hence only > utf-8 (with dash, but case insensitive) is accepted. Perhaps ICU is > forgiving, I don't remember/know, but I think w

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On lördag, maj 07, 2005 08.37.05 -0400 Bruce Momjian wrote: Palle Girgensohn wrote: > > Is this patch ready for application? I don't think so, not quite. I have not had any positive reports from linux users, this is only tested in a FreeBSD environment. I'd say it needs some more testing. O

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
we do not have at present. Any thoughts? ... John > -Original Message- > From: John Hansen > Sent: Saturday, May 07, 2005 11:09 PM > To: 'Palle Girgensohn'; 'Bruce Momjian' > Cc: 'pgsql-hackers@postgresql.org' > Subject: RE: [HACKERS]

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
age- > >> From: [EMAIL PROTECTED] > >> [mailto:[EMAIL PROTECTED] On Behalf Of > John Hansen > >> Sent: Saturday, May 07, 2005 10:23 PM > >> To: Palle Girgensohn; Bruce Momjian > >> Cc: pgsql-hackers@postgresql.org > >> Subject: Re: [HACK

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Bruce Momjian wrote: > Palle Girgensohn wrote: > > > > > > Is this patch ready for application? > > > > I don't think so, not quite. I have not had any positive > reports from > > linux users, this is only tested in a FreeBSD environment. > I'd say it > > needs some more testing. > > OK. > >

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
ssage- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Hansen Sent: Saturday, May 07, 2005 10:23 PM To: Palle Girgensohn; Bruce Momjian Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU > > I use this patch in production on one FreeBS

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
Subject: Re: [HACKERS] Patch for collation using ICU > > > > > I use this patch in production on one FreeBSD 4.10 server at the > > moment. > > With the latest version, I've had no problems. Logging is > switched on > > for now, and it shows no signs of ICU co

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Bruce Momjian
Palle Girgensohn wrote: > > > > Is this patch ready for application? > > I don't think so, not quite. I have not had any positive reports from linux > users, this is only tested in a FreeBSD environment. I'd say it needs some > more testing. OK. > Also, apparently, ICU is installed by default

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> > I use this patch in production on one FreeBSD 4.10 server at > the moment. > With the latest version, I've had no problems. Logging is > switched on for > now, and it shows no signs of ICU complaining. I'd like more > reports on > Linux, though. I currently use this on gentoo with ICU3.2

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On fredag, maj 06, 2005 22.57.59 -0400 Bruce Momjian wrote: Is this patch ready for application? http://people.freebsd.org/~girgen/postgresql-icu/pg-802-icu-2005-05-06.diff.gz The web site is: http://people.freebsd.org/~girgen/postgresql-icu/readme.html I don't think so, not quite.

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread Palle Girgensohn
--On fredag, maj 06, 2005 23.31.20 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: Bruce Momjian writes: Is this patch ready for application? Not until ICU is released under a BSD license ... It's not GPL anyway. Seems pretty much like the BSD license, at least more BSD-ish than GPL-ish.

Re: [HACKERS] Patch for collation using ICU

2005-05-07 Thread John Hansen
> [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane > Sent: Saturday, May 07, 2005 3:17 PM > To: Bruce Momjian > Cc: Palle Girgensohn; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Patch for collation using ICU > > Bruce Momjian writes: > > Tom Lane wrote: > >

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread Tom Lane
Bruce Momjian writes: > Tom Lane wrote: >> Not until ICU is released under a BSD license ... > Well, readline isn't BSD either, but we use it. Is it any different? Did you read the license? Some of the more troubling bits: : It is the understanding of INTERNATIONAL BUSINESS MACHINES CORPORATI

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread John Hansen
Btw, does it feel right to have pg depend on the bleeding edge version of ICU? On many distros, even gentoo (known for being bleeding edge), 2.8 is still the default. 2.8 and 3.2 are however incompatible, and supporting both would bloat the source somewhat. ... John -

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread John Hansen
> Why do you need to add a mapping of encoding names from iana > to our names? > The pg encoding names are not recognized by ICU, hence the mappings Install ICU 3.2 on your system, and run uconv -l, that will give you a list of valid ICU encoding names. ... John -
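
The need for a name mapping described above can be sketched as a small lookup table. This is a hypothetical illustration, not the patch's actual table: only the UNICODE entry is grounded in this thread (PostgreSQL's `--encoding UNICODE` means UTF-8, while the ICU 3.2 alias list quoted elsewhere in the thread files "unicode" under UTF-16). `PG_TO_ICU_NAME` and `icu_encoding_name` are made-up names.

```python
from typing import Optional

# Hypothetical mapping from PostgreSQL-side encoding names to names that
# ICU accepts. Only UNICODE -> UTF-8 is taken from the thread; a real
# table would need one entry per supported encoding.
PG_TO_ICU_NAME = {
    "UNICODE": "UTF-8",
}


def icu_encoding_name(pg_name: str) -> Optional[str]:
    """Return the ICU-side name for a PostgreSQL encoding name, if known."""
    return PG_TO_ICU_NAME.get(pg_name.upper())


print(icu_encoding_name("unicode"))
print(icu_encoding_name("NO_SUCH_ENCODING"))
```

Passing the PostgreSQL name straight through would be wrong in exactly this case, since the same string names a different encoding on the ICU side.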

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread Andrew - Supernews
On 2005-05-07, Bruce Momjian wrote: > Tom Lane wrote: >> Bruce Momjian writes: >> > Is this patch ready for application? >> >> Not until ICU is released under a BSD license ... > > Well, readline isn't BSD either, but we use it. Is it any different? ICU appears to be under the X license, which

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > Is this patch ready for application? > > Not until ICU is released under a BSD license ... Well, readline isn't BSD either, but we use it. Is it any different? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.p

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread Tom Lane
Bruce Momjian writes: > Is this patch ready for application? Not until ICU is released under a BSD license ... regards, tom lane

Re: [HACKERS] Patch for collation using ICU

2005-05-06 Thread Bruce Momjian
Is this patch ready for application? http://people.freebsd.org/~girgen/postgresql-icu/pg-802-icu-2005-05-06.diff.gz The web site is: http://people.freebsd.org/~girgen/postgresql-icu/readme.html I do have a few questions: Why don't you use the lc_ctype_is_c() part of this test

Re: [HACKERS] Patch for collation using ICU

2005-03-30 Thread Peter Eisentraut
Andrew Dunstan wrote: > [andrew pgsql]$ LC_ALL=en_US.UTF-8 locale charmap > UTF-8 That seems normal. Time to get out the debugger, I suppose. -- Peter Eisentraut http://developer.postgresql.org/~petere/

Re: [HACKERS] Patch for collation using ICU

2005-03-30 Thread Andrew Dunstan
Peter Eisentraut wrote: Andrew Dunstan wrote: The database cluster will be initialized with locale en_US.UTF-8. initdb: could not find suitable encoding for locale "en_US.UTF-8" What does $ LC_ALL=en_US.UTF-8 locale charmap show? [andrew pgsql]$ LC_ALL=en_US.UTF-8 locale charmap UTF-8 [

Re: [HACKERS] Patch for collation using ICU

2005-03-30 Thread Peter Eisentraut
Palle Girgensohn wrote: > Just a comment: ORDER BY *is* already case insensitive on Linux, since > its strcoll ignores case. I doubt very much it violates SQL > standards. The behavior of collation sequences is implementation-defined. So as long as you can put the behavior in words, it should be O

Re: [HACKERS] Patch for collation using ICU

2005-03-30 Thread Peter Eisentraut
Andrew Dunstan wrote: > The database cluster will be initialized with locale en_US.UTF-8. > initdb: could not find suitable encoding for locale "en_US.UTF-8" What does $ LC_ALL=en_US.UTF-8 locale charmap show? -- Peter Eisentraut http://developer.postgresql.org/~petere/

Re: [HACKERS] Patch for collation using ICU

2005-03-29 Thread Palle Girgensohn
--On söndag, mars 27, 2005 04.34.03 +0300 Hannu Krosing <[EMAIL PROTECTED]> wrote: On L, 2005-03-26 at 03:09 +0100, Palle Girgensohn wrote: Hi! ... I've noticed a couple of things about using the ICU patch vs. pristine pg-8.0.1: - ORDER BY is case insensitive when using ICU. This might break the

Re: [HACKERS] Patch for collation using ICU

2005-03-29 Thread Hannu Krosing
On L, 2005-03-26 at 03:09 +0100, Palle Girgensohn wrote: > Hi! > ... > I've noticed a couple of things about using the ICU patch vs. pristine > pg-8.0.1: > > - ORDER BY is case insensitive when using ICU. This might break the SQL > standard (?), but sure is nice :) How does your patch interac

Re: [HACKERS] Patch for collation using ICU

2005-03-28 Thread Palle Girgensohn
--On söndag, mars 27, 2005 20.11.48 +0200 Magnus Hagander <[EMAIL PROTECTED]> wrote: As for general collation of unicode, the reason for me to use ICU is that my system does not support strcoll correctly for multibyte locales, as I mentioned earlier. I also noted that even for systems that do ha

Re: [HACKERS] Patch for collation using ICU

2005-03-27 Thread Magnus Hagander
>As for general collation of unicode, the reason for me to use >ICU is that >my system does not support strcoll correctly for multibyte >locales, as I >mentioned earlier. I also noted that even for systems that do handle >strcoll correctly for unicode, ICU claims to be a couple of magnitudes

Re: [HACKERS] Patch for collation using ICU

2005-03-26 Thread Palle Girgensohn
--On lördag, mars 26, 2005 17.40.01 -0800 Stephan Szabo <[EMAIL PROTECTED]> wrote: On Sun, 27 Mar 2005, Palle Girgensohn wrote: --On lördag, mars 26, 2005 08.16.01 -0800 Stephan Szabo <[EMAIL PROTECTED]> wrote: > On Sat, 26 Mar 2005, Palle Girgensohn wrote: >> I've noticed a couple of things ab

Re: [HACKERS] Patch for collation using ICU

2005-03-26 Thread Stephan Szabo
On Sun, 27 Mar 2005, Palle Girgensohn wrote: > > > --On lördag, mars 26, 2005 08.16.01 -0800 Stephan Szabo > <[EMAIL PROTECTED]> wrote: > > > On Sat, 26 Mar 2005, Palle Girgensohn wrote: > >> I've noticed a couple of things about using the ICU patch vs. pristine > >> pg-8.0.1: > >> > >> - ORDER B

Re: [HACKERS] Patch for collation using ICU

2005-03-26 Thread Palle Girgensohn
--On lördag, mars 26, 2005 13.59.19 +1100 John Hansen <[EMAIL PROTECTED]> wrote: - ORDER BY is case insensitive when using ICU. This might break the SQL standard (?), but sure is nice :) This would mean that indexes are also case insensitive right? Which makes it a Bad Thing(tm). Well, no, not r

Re: [HACKERS] Patch for collation using ICU

2005-03-26 Thread Palle Girgensohn
--On lördag, mars 26, 2005 08.16.01 -0800 Stephan Szabo <[EMAIL PROTECTED]> wrote: On Sat, 26 Mar 2005, Palle Girgensohn wrote: I've noticed a couple of things about using the ICU patch vs. pristine pg-8.0.1: - ORDER BY is case insensitive when using ICU. This might break the SQL standard (?), b

Re: [HACKERS] Patch for collation using ICU

2005-03-26 Thread Stephan Szabo
On Sat, 26 Mar 2005, Palle Girgensohn wrote: > I've noticed a couple of things about using the ICU patch vs. pristine > pg-8.0.1: > > - ORDER BY is case insensitive when using ICU. This might break the SQL > standard (?), but sure is nice :) Err, I think if your system implements strcoll correctly
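
The difference being discussed, between byte-order sorting and a case-insensitive ORDER BY, can be shown with a rough stand-in. This is a sketch under stated assumptions: Python's code-point sort stands in for the C locale, and case-folded keys stand in for a case-insensitive collation. Neither is what strcoll or ICU actually does, since a real locale collation typically still uses case to break ties.

```python
names = ["b", "A", "a", "B"]

# Code-point order, similar in spirit to the C locale: all uppercase
# ASCII letters sort before all lowercase ones.
codepoint_order = sorted(names)

# Case-insensitive order: compare case-folded keys. Python's sort is
# stable, so items with equal keys keep their input order.
case_insensitive_order = sorted(names, key=str.casefold)

print(codepoint_order)
print(case_insensitive_order)
```

In the second ordering "A" and "a" sit next to each other, which is the behaviour the original poster found convenient.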

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread John Hansen
with exit code 139 initdb: removing contents of data directory "/var/lib/postgres/data" ... John > -Original Message- > From: Palle Girgensohn [mailto:[EMAIL PROTECTED] > Sent: Saturday, March 26, 2005 1:10 PM > To: pgsql-hackers@postgresql.org > Cc: John Hansen;

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread John Hansen
> -Original Message- > From: Palle Girgensohn [mailto:[EMAIL PROTECTED] > Sent: Saturday, March 26, 2005 1:10 PM > To: pgsql-hackers@postgresql.org > Cc: John Hansen; Andrew Dunstan > Subject: Re: [HACKERS] Patch for collation using ICU > > --On fredag, mars

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Palle Girgensohn
--On lördag, mars 26, 2005 10.42.19 +1100 John Hansen <[EMAIL PROTECTED]> wrote: FYI, I also found that initdb crashes with error 139 on any locale other than C/POSIX. Odd, not for me, but I did make a bad assumption about character encoding. Perhaps the new patch will help? (see previous mail)

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Palle Girgensohn
--On fredag, mars 25, 2005 00.40.04 +0100 Palle Girgensohn <[EMAIL PROTECTED]> wrote: Hi! I've put together a patch for using IBM's ICU package for collation. If your OS does not have full support for collation or uppercase/lowercase in multibyte locales, this might be useful. If you are using a

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread John Hansen
; > > > ... John > > > >> -----Original Message- > >> From: John Hansen > >> Sent: Friday, March 25, 2005 10:27 PM > >> To: 'Palle Girgensohn'; 'pgsql-hackers@postgresql.org' > >> Subject: RE: [HACKERS] Patch for collation

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Palle Girgensohn
--On fredag, mars 25, 2005 09.53.38 -0500 Tom Lane <[EMAIL PROTECTED]> wrote: Palle Girgensohn <[EMAIL PROTECTED]> writes: hmm... I think I might have made a false assumption that the locale string would contain the character encoding. You certainly cannot assume that. Would that it were so easy

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Andrew Dunstan
Tom Lane wrote: Palle Girgensohn <[EMAIL PROTECTED]> writes: hmm... I think I might have made a false assumption that the locale string would contain the character encoding. You certainly cannot assume that. Would that it were so easy to find out the character set for a locale :-(. There

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Tom Lane
Palle Girgensohn <[EMAIL PROTECTED]> writes: > hmm... I think I might have made a false assumption that > the locale string would contain the character encoding. You certainly cannot assume that. Would that it were so easy to find out the character set for a locale :-(. There's some code in ini
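
Why the locale string alone is not enough can be sketched with a small parser. This is an illustration only: `codeset_from_locale_name` is a hypothetical helper, and it assumes the common POSIX-style `language_TERRITORY.codeset@modifier` spelling. It is not the code in initdb that the message refers to.

```python
from typing import Optional


def codeset_from_locale_name(locale_name: str) -> Optional[str]:
    """Return the codeset part of a POSIX-style locale name, or None."""
    # Drop an optional @modifier, e.g. "de_DE.UTF-8@euro".
    base = locale_name.split("@", 1)[0]
    if "." not in base:
        # Names such as "C" or "en_US" carry no codeset at all.
        return None
    return base.split(".", 1)[1] or None


for name in ("en_US.UTF-8", "en_US", "C"):
    print(name, "->", codeset_from_locale_name(name))
```

Names without a codeset part return nothing here, so the character set has to be asked of the system instead, as the `locale charmap` command used elsewhere in this thread does.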

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Palle Girgensohn
/Palle ... John -Original Message- From: John Hansen Sent: Friday, March 25, 2005 10:27 PM To: 'Palle Girgensohn'; 'pgsql-hackers@postgresql.org' Subject: RE: [HACKERS] Patch for collation using ICU > --On fredag, mars 25, 2005 16.34.41 +1100 John Hansen > <[EMAI

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread John Hansen
> -Original Message- > From: John Hansen > Sent: Friday, March 25, 2005 10:27 PM > To: 'Palle Girgensohn'; 'pgsql-hackers@postgresql.org' > Subject: RE: [HACKERS] Patch for collation using ICU > > > --On fredag, mars 25, 2005 16.34.41 +1100 Jo

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread John Hansen
> --On fredag, mars 25, 2005 16.34.41 +1100 John Hansen > <[EMAIL PROTECTED]> > wrote: > > > Useful if it's going to support earlier releases of ICU > > > > Not all os's come with ICU3.2, debian for example, > currently has 2.1 > > in testing, and 2.6 in unstable. > > Oh, OK. FreeBSD has o

Re: [HACKERS] Patch for collation using ICU

2005-03-25 Thread Palle Girgensohn
--On fredag, mars 25, 2005 16.34.41 +1100 John Hansen <[EMAIL PROTECTED]> wrote: Useful if it's going to support earlier releases of ICU Not all os's come with ICU3.2, debian for example, currently has 2.1 in testing, and 2.6 in unstable. Oh, OK. FreeBSD has only the 3.2 as port. I can check

Re: [HACKERS] Patch for collation using ICU

2005-03-24 Thread John Hansen
Useful if it's going to support earlier releases of ICU Not all os's come with ICU3.2, debian for example, currently has 2.1 in testing, and 2.6 in unstable. ... John > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Palle Girgensohn > Sent: