Daniel, thanks for the reply. I believe we will just remove the collation, allow LIKE to function normally, and wait for a future patch is one is ever provided.
On Thu, Oct 24, 2019 at 3:39 PM Daniel Verite <dan...@manitou-mail.org> wrote: > Jeff Lanzarotta wrote: > > > I have a question about nondeterministic collations in PostgreSQL 12. I > > have created a new collation that is nondeterministic and created several > > columns which use this collation. Querying these columns works great > until > > I use LIKE. When I do, I get the following error: > > > > SQL Error [0A000]: ERROR: nondeterministic collations are not supported > for > > LIKE > > > > Is there any plan to allow this functionality? > > PostgreSQL development is conducted without a roadmap [1]. Maybe > someone will submit a patch to enable LIKE with nondeterministic > collations, but so far it did not happen according to the current set > of patches at https://commitfest.postgresql.org > > Such matches can be weirder than you might think (not to > mention much slower). > Consider for instance a collation that ignores punctuation: > > CREATE COLLATION "nd3alt" ( > provider = 'icu', > locale='und@colAlternate=shifted', > deterministic = false > ); > > In the icu_ext extension, icu_strpos [2] can match a substring with > a nondeterministic collation, which is one part of what LIKE > would need to do for such collations. The implementation uses > the string search facility of the ICU library. > > With the above-defined collation, we can have for instance: > > SELECT icu_strpos('abc. ...de', 'c,d' COLLATE nd3alt); > icu_strpos > ------------ > 3 > > So even though 'c,d' is not a substring of 'abc. ...de' in the common > sense, it is recognized as such by this collation, by design. > > A LIKE operator for nondeterministic collations should be able to > recognize this too, but with an arbitrary number of substrings to > match in the pattern, plus it should handle the underscore wildcard > in a way that hopefully makes sense. > > With the example above, > 'abc. ...de' LIKE '%c,d%' COLLATE nd3alt > should certainly be a match, but in the case of this variant: > 'abc. ...de' LIKE '%c_d%' COLLATE nd3alt > it's not necessarily clear how (or even if) it should work. > > > [1] https://www.postgresql.org/developer/roadmap/ > [2] https://github.com/dverite/icu_ext#icu_strpos > > > Best regards, > -- > Daniel Vérité > PostgreSQL-powered mailer: http://www.manitou-mail.org > Twitter: @DanielVerite >