Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2015-03-19 Thread Bruce Momjian
On Sun, Sep 28, 2014 at 10:33:33PM -0400, Bruce Momjian wrote: > On Mon, Sep 15, 2014 at 03:42:20PM -0700, Peter Geoghegan wrote: > > On Mon, Sep 15, 2014 at 12:45 PM, Tom Lane wrote: > > > No. And we don't know how to change the default opclass without > > > breaking things, either. > > > > Is

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-29 Thread Oleg Bartunov
On Mon, Sep 29, 2014 at 11:48 AM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote: > On 09/15/2014 06:28 PM, Alexander Korotkov wrote: > >> Hackers, >> >> some GIN opclasses uses collation-aware comparisons while they don't need >> to do especially collation-aware comparison. Examples are text

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-29 Thread Heikki Linnakangas
On 09/15/2014 06:28 PM, Alexander Korotkov wrote: Hackers, some GIN opclasses uses collation-aware comparisons while they don't need to do especially collation-aware comparison. Examples are text[] and hstore opclasses. Hmm. It would be nice to use the index for inequality searches, at least

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-28 Thread Bruce Momjian
On Mon, Sep 15, 2014 at 03:42:20PM -0700, Peter Geoghegan wrote: > On Mon, Sep 15, 2014 at 12:45 PM, Tom Lane wrote: > > No. And we don't know how to change the default opclass without > > breaking things, either. > > Is there a page on the Wiki along the lines of "things that we would > like to

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-28 Thread Bruce Momjian
On Tue, Sep 16, 2014 at 06:56:24PM +0400, Alexander Korotkov wrote: > On Tue, Sep 16, 2014 at 12:14 PM, Emre Hasegeli wrote: > > > > Changing the default opclasses should work if we make > > > pg_dump --binary-upgrade dump the default opclasses with indexes > > > and exclusion constra

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Tue, Sep 16, 2014 at 12:14 PM, Emre Hasegeli wrote: > > > Changing the default opclasses should work if we make > > > pg_dump --binary-upgrade dump the default opclasses with indexes > > > and exclusion constraints. I think it makes sense to do so in > > > --binary-upgrade mode. I can try to

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Emre Hasegeli
> > Changing the default opclasses should work if we make > > pg_dump --binary-upgrade dump the default opclasses with indexes > > and exclusion constraints. I think it makes sense to do so in > > --binary-upgrade mode. I can try to come with a patch for this. > > Can you explain it a bit more d

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Tue, Sep 16, 2014 at 11:29 AM, Emre Hasegeli wrote: > Changing the default opclasses should work if we make > pg_dump --binary-upgrade dump the default opclasses with indexes > and exclusion constraints. I think it makes sense to do so in > --binary-upgrade mode. I can try to come with a pat

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Emre Hasegeli
> No. And we don't know how to change the default opclass without > breaking things, either. See previous discussions about how we > might fix the totally-broken default gist opclass that btree_gist > creates for the inet type [1]. The motivation for getting rid of that > is *way* stronger than

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Mon, Sep 15, 2014 at 11:45 PM, Tom Lane wrote: > Peter Geoghegan writes: > > On Mon, Sep 15, 2014 at 8:28 AM, Alexander Korotkov > > wrote: > >> Rename such opclasses and make them not default. > >> Create new default opclasses with bitwise comparison functions. > >> Write recommendation to

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 12:45 PM, Tom Lane wrote: > No. And we don't know how to change the default opclass without > breaking things, either. Is there a page on the Wiki along the lines of "things that we would like to change if ever there is a substantial change in on-disk format that will bre

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-15 Thread Tom Lane
Peter Geoghegan writes: > On Mon, Sep 15, 2014 at 8:28 AM, Alexander Korotkov > wrote: >> Rename such opclasses and make them not default. >> Create new default opclasses with bitwise comparison functions. >> Write recommendation to re-create indexes with default opclasses into >> documentation.

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-15 Thread Alexander Korotkov
On Mon, Sep 15, 2014 at 10:51 PM, Peter Geoghegan wrote: > On Mon, Sep 15, 2014 at 8:28 AM, Alexander Korotkov > wrote: > > some GIN opclasses uses collation-aware comparisons while they don't > need to > > do especially collation-aware comparison. Examples are text[] and hstore > > opclasses. D

Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 8:28 AM, Alexander Korotkov wrote: > some GIN opclasses uses collation-aware comparisons while they don't need to > do especially collation-aware comparison. Examples are text[] and hstore > opclasses. Depending on collation this may make them a much slower. I'm glad that

[HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-15 Thread Alexander Korotkov
Hackers, some GIN opclasses uses collation-aware comparisons while they don't need to do especially collation-aware comparison. Examples are text[] and hstore opclasses. Depending on collation this may make them a much slower. See example. # show lc_collate ; lc_collate ─ ru_RU.UTF
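
As a rough illustration of the idea in this thread: the replies quote a proposal to give these opclasses "bitwise comparison functions". The sketch below shows what such a comparison might look like in plain C, assuming the keys are byte strings with explicit lengths. `bitwise_cmp` is a hypothetical name used only for this example; it is not a PostgreSQL function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: order two byte strings without consulting any
 * collation. Returns a negative value, zero, or a positive value,
 * in the same spirit as memcmp().
 */
int bitwise_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t minlen = (alen < blen) ? alen : blen;
    int cmp = memcmp(a, b, minlen);

    if (cmp != 0)
        return cmp;

    /* Common prefix is identical: the shorter string sorts first. */
    if (alen < blen)
        return -1;
    if (alen > blen)
        return 1;
    return 0;
}
```

An ordering like this is consistent and avoids the cost of locale-aware comparison, but it does not match the locale's sort order, so it would presumably suit only lookups that need equality rather than ordering.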

[HACKERS] collation, arrays, and ranges

2011-09-12 Thread Jeff Davis
My interpretation of collation for range types is different than that for arrays, so I'm presenting it here in case someone has an objection. An array type has the same typcollation as its element type. This makes sense, because comparison between arrays are affected by the COLLATE clause. Compar

Re: [HACKERS] collation, arrays, and ranges

2011-09-10 Thread Jeff Davis
On Sat, 2011-09-10 at 13:21 -0400, Tom Lane wrote: > > So, I chose to represent that as a separate > > rngcollation and leave the typcollation 0. In other words, collation is > > a concept internal to that range type and fixed at type definition time. > > Range types are affected by their internal

Re: [HACKERS] collation, arrays, and ranges

2011-09-10 Thread Tom Lane
Jeff Davis writes: > My interpretation of collation for range types is different than that > for arrays, so I'm presenting it here in case someone has an objection. > An array type has the same typcollation as its element type. This makes > sense, because comparison between arrays are affected by

[HACKERS] collation, arrays, and ranges

2011-09-10 Thread Jeff Davis
My interpretation of collation for range types is different than that for arrays, so I'm presenting it here in case someone has an objection. An array type has the same typcollation as its element type. This makes sense, because comparison between arrays are affected by the COLLATE clause. Compar

Re: [HACKERS] Collation mega-cleanups

2011-05-11 Thread Tom Lane
Peter Eisentraut writes: > Seriously, it looks pretty bad, but this is one of the biggest feature > patches in the last 5 years, it touches many places all over the system, > and there is a reason why this topic has been on the TODO list for 10 > years: it's overwhelming. Yeah. I did not want to

Re: [HACKERS] Collation mega-cleanups

2011-05-11 Thread Bruce Momjian
Peter Eisentraut wrote: > from that? The bigger your patch, the lonelier you are. I can attest to that.

Re: [HACKERS] Collation mega-cleanups

2011-05-11 Thread Peter Eisentraut
On mån, 2011-05-09 at 14:58 -0400, Bruce Momjian wrote: > Tom this collation stuff has seen more post-feature-commit cleanups > than I think any patch I remember. Is there anything we can learn > from this? Don't do big patches? Seriously, it looks pretty bad, but this is one of the biggest feat

Re: [HACKERS] Collation mega-cleanups

2011-05-10 Thread Tom Lane
"Ross J. Reedstrom" writes: > So perhaps it was more of the "This code is less ready than I thought > it was, but now that I've spent the time understanding it and the > problem, the shortest way out is forward". Yeah, exactly. By the time I really understood how incomplete the collation patch w

Re: [HACKERS] Collation mega-cleanups

2011-05-10 Thread Ross J. Reedstrom
On Tue, May 10, 2011 at 07:21:16PM +0200, Andres Freund wrote: > On Tuesday, May 10, 2011 07:08:23 PM Ross J. Reedstrom wrote: > > On Mon, May 09, 2011 at 03:57:12PM -0400, Robert Haas wrote: > > > On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian wrote: > > > > Tom this collation stuff has seen more

Re: [HACKERS] Collation mega-cleanups

2011-05-10 Thread Andres Freund
On Tuesday, May 10, 2011 07:08:23 PM Ross J. Reedstrom wrote: > On Mon, May 09, 2011 at 03:57:12PM -0400, Robert Haas wrote: > > On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian wrote: > > > Tom this collation stuff has seen more post-feature-commit cleanups > > > than I think any patch I remember.

Re: [HACKERS] Collation mega-cleanups

2011-05-10 Thread Ross J. Reedstrom
On Mon, May 09, 2011 at 03:57:12PM -0400, Robert Haas wrote: > On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian wrote: > > Tom this collation stuff has seen more post-feature-commit cleanups than > > I think any patch I remember.  Is there anything we can learn from this? > > How about "don't commit

Re: [HACKERS] Collation mega-cleanups

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian wrote: > Tom this collation stuff has seen more post-feature-commit cleanups than > I think any patch I remember. Is there anything we can learn from this? How about "don't commit all the large patches at the end of the cycle"?

Re: [HACKERS] Collation mega-cleanups

2011-05-09 Thread Tom Lane
Bruce Momjian writes: > Tom this collation stuff has seen more post-feature-commit cleanups than > I think any patch I remember. Is there anything we can learn from this? The pre-commit review was obviously woefully inadequate. regards, tom lane

[HACKERS] Collation mega-cleanups

2011-05-09 Thread Bruce Momjian
Tom this collation stuff has seen more post-feature-commit cleanups than I think any patch I remember. Is there anything we can learn from this? Yes, this is coming from me, who some consider to be the king of post-commit cleanups, namely, cleaning up my own commits.

Re: [HACKERS] Collation patch's handling of wcstombs/mbstowcs is sheerest fantasy

2011-04-24 Thread Peter Eisentraut
On lör, 2011-04-23 at 11:37 -0400, Tom Lane wrote: > I wrote: > > * Where they're not, install the locale_t with uselocale(), do > > mbstowcs or wcstombs, and revert to the former locale_t setting. > > This is ugly as sin, and not thread-safe, but of course lots of > > the backend is not thread-saf

Re: [HACKERS] Collation patch's handling of wcstombs/mbstowcs is sheerest fantasy

2011-04-24 Thread Peter Eisentraut
On fre, 2011-04-22 at 16:32 -0400, Tom Lane wrote: > It's possible that things are not too broken in practice, because it's > likely that the transformations done by these functions only depend on > the encoding indicated by LC_CTYPE, and we (try to) enforce that all > locales used in a given datab

Re: [HACKERS] Collation patch's handling of wcstombs/mbstowcs is sheerest fantasy

2011-04-23 Thread Tom Lane
I wrote: > * Where they're not, install the locale_t with uselocale(), do > mbstowcs or wcstombs, and revert to the former locale_t setting. > This is ugly as sin, and not thread-safe, but of course lots of > the backend is not thread-safe. I've been corrected on that: uselocale() *is* thread safe
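
The install/convert/revert sequence described in this message can be sketched with the POSIX 2008 per-thread locale API. This is a standalone illustration under stated assumptions, not the backend's code: `mbstowcs_in_locale` is a hypothetical name, and the caller is assumed to have obtained `loc` from `newlocale()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: run mbstowcs() with `loc` installed as the
 * calling thread's locale, then put the previous locale back.
 * Returns the number of wide characters written, or (size_t) -1 if
 * the locale could not be installed or the input was invalid.
 */
size_t mbstowcs_in_locale(wchar_t *dest, const char *src, size_t n,
                          locale_t loc)
{
    /* uselocale() returns the previously installed locale handle. */
    locale_t previous = uselocale(loc);
    size_t result;

    if (previous == (locale_t) 0)
        return (size_t) -1;

    result = mbstowcs(dest, src, n);

    /* Revert to whatever was in effect before the call. */
    uselocale(previous);
    return result;
}
```

Because `uselocale()` affects only the calling thread, the switch is not visible to other threads, which is the property the correction in this message refers to.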

Re: [HACKERS] Collation patch's handling of wcstombs/mbstowcs is sheerest fantasy

2011-04-22 Thread Tom Lane
I wrote: > I just noticed that the collation patch has modified char2wchar and > wchar2char to accept a collation OID as argument ... but it hasn't done > anything to make those arguments actually work. Since those functions > depend on wcstombs and mbstowcs, which respond to LC_CTYPE and nothing

[HACKERS] Collation patch's handling of wcstombs/mbstowcs is sheerest fantasy

2011-04-22 Thread Tom Lane
I just noticed that the collation patch has modified char2wchar and wchar2char to accept a collation OID as argument ... but it hasn't done anything to make those arguments actually work. Since those functions depend on wcstombs and mbstowcs, which respond to LC_CTYPE and nothing else, this flat o

Re: [HACKERS] Collation

2009-08-11 Thread Alvaro Herrera
David Fetter wrote: > Folks, > > While trying unsuccessfully to help someone in IRC, they pointed me to > this: > > http://www.flexiguided.de/publications.pgcollkey.en.html > > Is anybody working with the kind people of FlexiGuided GmbH to see > about integrating this feature more tightly with P

[HACKERS] Collation

2009-08-11 Thread David Fetter
Folks, While trying unsuccessfully to help someone in IRC, they pointed me to this: http://www.flexiguided.de/publications.pgcollkey.en.html Is anybody working with the kind people of FlexiGuided GmbH to see about integrating this feature more tightly with PostgreSQL? If not, how would we make

Re: [HACKERS] Collation at database level

2008-04-16 Thread Gregory Stark
"Radek Strnad" <[EMAIL PROTECTED]> writes: > The problem with POSIX locales is that you never know what > locales user have got installed. I've discovered that some linux distros > don't even have other than UTF-8 based locales. On Debian you're even deeper in it. The user can configure which lo

[HACKERS] Collation at database level

2008-04-16 Thread Radek Strnad
Hi, I'm working on the bachelor thesis. The goal of the work will be to implement collation at database level based on POSIX locales and make foundations for further national language support development. User will be able to set collation when creating database or change collation of exis

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-24 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > The glibc docs sample code suggests using 2x the original string > length for the initial buffer. My testing showed that *always* > triggered the exceptional case. A bit of experimentation lead to the > 3x+4 which eliminates it except for 0 and 1 byte string
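
The buffer-sizing question in this message follows from how `strxfrm()` reports its result: it returns the length it needs, and if that is not smaller than the buffer size, the output is unusable and the call has to be repeated with a larger buffer. A minimal sketch, assuming plain `malloc` rather than the backend's allocator and using the 3x+4 initial guess quoted above; `make_sort_key` is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build a strxfrm() sort key for `src` under the
 * current LC_COLLATE setting. Returns a malloc'd string that the
 * caller must free, or NULL if memory could not be allocated.
 */
char *make_sort_key(const char *src)
{
    size_t guess = 3 * strlen(src) + 4;   /* initial guess from the thread */
    char *buf = malloc(guess);
    size_t needed;

    if (buf == NULL)
        return NULL;

    needed = strxfrm(buf, src, guess);
    if (needed >= guess)
    {
        /* Exceptional case: the key did not fit, so retry at exact size. */
        char *bigger = realloc(buf, needed + 1);

        if (bigger == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = bigger;
        (void) strxfrm(buf, src, needed + 1);
    }
    return buf;
}
```

A generous first guess trades some memory for fewer second calls; whether 3x+4 is the right figure depends on the locale and C library, so treat it as an observation from this thread rather than a guarantee.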

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-24 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes: > On Fri, 22 Aug 2003, Tom Lane wrote: >> I'd go so far as to make it a critical section --- that ensures that any >> ERROR will be turned to FATAL, even if it's in a subroutine you call. > I didn't know we could do that, could be handy, although the comme

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Joe Conway
Greg Stark wrote:
Joe Conway <[EMAIL PROTECTED]> writes:
if (sigsetjmp(Warn_restart, 1) != 0)
{
    memcpy(&Warn_restart, &save_restart, sizeof(Warn_restart));
    newlocale = setlocale(LC_COLLATE, oldlocale);
    if (!newlocale)
        elog(PANIC, "setlocale failed to reset locale: %s", localestr);

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Greg Stark
Joe Conway <[EMAIL PROTECTED]> writes:
> > if (sigsetjmp(Warn_restart, 1) != 0)
> > {
> >     memcpy(&Warn_restart, &save_restart, sizeof(Warn_restart));
> >     newlocale = setlocale(LC_COLLATE, oldlocale);
> >     if (!newlocale)
> >         elog(PANIC, "setlocale failed to reset locale: %s",

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Joe Conway
Joe Conway wrote: What about something like this? Oops! Forgot to restore error handling. See below: Joe 8< #include #include #include "postgres.h" #include "fmgr.h" #include "tcop/tcopprot.h" #include "utils/builtins.h" #define GET_STR(textp) \ DatumGetCStri

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Joe Conway
Greg Stark wrote: Yeah I thought of that. But if making it a critical section is cheap then it's probably a better approach. The problem with restoring the locale for the palloc is that if the user is unlucky he might sort a table of thousands of strings that all trigger the exception case. What ab

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Stephan Szabo
On 23 Aug 2003, Greg Stark wrote: > Stephan Szabo <[EMAIL PROTECTED]> writes: > > > Since most of that work is for an exceptional case, maybe it'd be safer > > (although slower) to structure the function as > > Yeah I thought of that. But if making it a critical section is cheap then it's > probab

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Greg Stark
Stephan Szabo <[EMAIL PROTECTED]> writes: > Since most of that work is for an exceptional case, maybe it'd be safer > (although slower) to structure the function as Yeah I thought of that. But if making it a critical section is cheap then it's probably a better approach. The problem with restorin

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-23 Thread Stephan Szabo
On Fri, 22 Aug 2003, Stephan Szabo wrote: > On Fri, 22 Aug 2003, Tom Lane wrote: > > > Stephan Szabo <[EMAIL PROTECTED]> writes: > > > On 22 Aug 2003, Greg Stark wrote: > > >> If it's deemed a reasonable approach and nobody has any fatal flaws then I > > >> expect it would be useful to put in the

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-22 Thread Stephan Szabo
On Fri, 22 Aug 2003, Tom Lane wrote: > Stephan Szabo <[EMAIL PROTECTED]> writes: > > On 22 Aug 2003, Greg Stark wrote: > >> If it's deemed a reasonable approach and nobody has any fatal flaws then I > >> expect it would be useful to put in the contrib directory? > > > I'm not sure that ERROR if th

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-22 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes: > On 22 Aug 2003, Greg Stark wrote: >> If it's deemed a reasonable approach and nobody has any fatal flaws then I >> expect it would be useful to put in the contrib directory? > I'm not sure that ERROR if the locale cannot be put back is sufficient > (alth

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-22 Thread Stephan Szabo
On 22 Aug 2003, Greg Stark wrote: > > So, I needed a way to sort using collation rules other than the one the > database was built with. So I wrote up the following function exposing strxfrm > with an extra parameter to specify the LC_COLLATE value to use. > > This is my first C function so I'm r

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-22 Thread Peter Eisentraut
Greg Stark writes: > This is my first C function so I'm really unsure that I've done the right > thing. For the most part I pattern-matched off the string_io code in the > contrib directory. That was just about the worst example you could have picked. Please forget everything you have seen and s

Re: [HACKERS] Collation rules and multi-lingual databases

2003-08-22 Thread Greg Stark
So, I needed a way to sort using collation rules other than the one the database was built with. So I wrote up the following function exposing strxfrm with an extra parameter to specify the LC_COLLATE value to use. This is my first C function so I'm really unsure that I've done the right thing. F
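
The approach described in this message (calling `strxfrm` with a caller-chosen `LC_COLLATE` value) can be sketched outside the backend as follows. This is not the function posted to the list: `sort_key_in_locale` is a hypothetical name, plain `malloc` stands in for the backend allocator, and `abort()` stands in for the fatal error that the replies discuss for the case where the original locale cannot be restored.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd strxfrm() key for `src`
 * computed under the LC_COLLATE locale named `collate_name`.
 * Returns NULL if the locale is unavailable or memory runs out.
 */
char *sort_key_in_locale(const char *src, const char *collate_name)
{
    const char *current = setlocale(LC_COLLATE, NULL);
    char *saved;
    char *key = NULL;

    if (current == NULL)
        return NULL;

    /* Copy the name: later setlocale() calls may overwrite it. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return NULL;
    strcpy(saved, current);

    if (setlocale(LC_COLLATE, collate_name) != NULL)
    {
        /* With a zero-size buffer, strxfrm() only reports the length. */
        size_t needed = strxfrm(NULL, src, 0);

        key = malloc(needed + 1);
        if (key != NULL)
            (void) strxfrm(key, src, needed + 1);

        /* Continuing with the wrong collation would be unsafe. */
        if (setlocale(LC_COLLATE, saved) == NULL)
            abort();
    }

    free(saved);
    return key;
}
```

Keys produced this way are meant to be compared with `strcmp()`. Note that `setlocale()` changes process-wide state, so a sketch like this is only reasonable in single-threaded code.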

[HACKERS] Collation and case mapping thoughts (long)

2002-11-12 Thread Peter Eisentraut
I have been doing some research about how to create new routines for string collation and character case mapping that would allow us to break out of the one-locale-per-process scheme. I have found that the Unicode standard provides a lot of useful specifications and data for this. The Unicode dat

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Bruce Momjian
If you feel strongly about it, go ahead. I didn't see any problem reports on it, and it seemed kind of iffy, so I thought we should hold it. > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Comparing NaN/Invalid seems so off the beaten path that we would just > > wait for 7.2. That and no one ha

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > Comparing NaN/Invalid seems so off the beaten path that we would just > wait for 7.2. That and no one has reported a problem with it so far. Do you consider "vacuum analyze" on the regression database to be off the beaten path? How about creating an i

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes: > What parts of the changes would require an initdb, would new > functions need to be added or the index ops need to change or would > it be fixes to the existing functions (if the latter, wouldn't a recompile > and dropping/recreating the indexes be enoug

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Stephan Szabo
On Wed, 2 May 2001, Tom Lane wrote: > Stephan Szabo <[EMAIL PROTECTED]> writes: > > What parts of the changes would require an initdb, would new > > functions need to be added or the index ops need to change or would > > it be fixes to the existing functions (if the latter, wouldn't a recompile >

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Bruce Momjian
> A closely related problem is that the "current time" special value > supported by several of the date/time datatypes is inherently not > compatible with being indexed, since its sort order relative to > ordinary time values keeps changing. We had discussed removing this > special case, and I th

Re: [HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Stephan Szabo
> I am planning to fix this by ensuring that all these operations agree > on an (arbitrarily chosen) sort order for the "weird" values of these > types. What I'm wondering about is whether to insert the fixes into > 7.1.1 or wait for 7.2. In theory changing the sort order might break > existing

[HACKERS] Collation order for btree-indexable datatypes

2001-05-02 Thread Tom Lane
To avoid getting into states where a btree index is corrupt (or appears that way), it is absolutely critical that the datatype provide a unique, consistent sort order. In particular, the operators = <> < <= > >= had better all agree with each other and with the 3-way-comparison support function a
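
One way to make the six operators agree with the three-way comparison function is to define every operator in terms of it. The sketch below does this for a floating-point type, assuming (arbitrarily, for illustration only) that NaN sorts after every other value and that all NaNs compare equal. The `fp_*` names are hypothetical and are not the backend's functions.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Three-way comparison defining a total order on doubles.
 * Assumption for this sketch: NaN is greater than any non-NaN value,
 * and two NaNs are equal to each other.
 */
int fp_cmp(double a, double b)
{
    if (isnan(a))
        return isnan(b) ? 0 : 1;
    if (isnan(b))
        return -1;
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* Every operator is derived from fp_cmp(), so they cannot disagree. */
bool fp_eq(double a, double b) { return fp_cmp(a, b) == 0; }
bool fp_ne(double a, double b) { return fp_cmp(a, b) != 0; }
bool fp_lt(double a, double b) { return fp_cmp(a, b) < 0; }
bool fp_le(double a, double b) { return fp_cmp(a, b) <= 0; }
bool fp_gt(double a, double b) { return fp_cmp(a, b) > 0; }
bool fp_ge(double a, double b) { return fp_cmp(a, b) >= 0; }
```

With the native C operators, `NaN == NaN` is false and `NaN < x` is false for every `x`, so no consistent sort position exists for NaN. Routing every operator through a single comparison function removes that inconsistency, whichever position is chosen for the special values.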