Hi ne 9. 12. 2018 v 18:50 odesÃlatel Tom Lane <t...@sss.pgh.pa.us> napsal:
> I've been experimenting with the task proposed in [1] of expanding > the text_ops operator family to include type "name" as well as > cross-type text vs. name operators. These operators would need to > offer collation-aware sorting, since that's exactly the difference > between text_ops and the non-collation-aware name_ops opfamily. > I ran into a nasty stumbling block almost immediately: the proposed > name vs. name comparison operators fail, because the parser sees > that both inputs are of noncollatable types so it doesn't assign > any collation to the operator node. > > I experimented with leaving out the name vs. name operators and > just adding cross-type text vs. name and name vs. text operators. > That turns out not to work well at all. Aside from the fact that > opr_sanity whines about an incomplete operator family, I found > various situations where the planner fails, complaining about > things like "missing operator 1(19,19) in opfamily 1994". The > root of that mess seems to be that we've supposed that if an > equality operator is marked mergejoinable then it is mergejoinable > in every opfamily that it's a member of. But that isn't true in > an opfamily structure like this. For instance "text = name" should > be mergejoinable in the name_ops opclass, since we know how to sort > both text and name in non-collation-aware ways. But it's not > mergejoinable in the text_ops opclass if text_ops doesn't provide > collation-aware name vs. name operators to sort the name input with. > > We could probably fix that, at the cost of about tripling the work > needed to detect whether an operator is really mergejoinable, but > I have little confidence that there aren't more problems lurking > behind it. There are a lot of aspects of EquivalenceClass processing > that look pretty questionable if we're trying to support operators > that act this way. For instance, if we derive "a = c" given "a = b" > and "b = c", the equality operator in "a = c" might be mergejoinable > in a different set of opclasses than the other two operators are, > making it debatable whether it can be thought to belong to the same > EquivalenceClass at all. > > So the other approach I'm contemplating is to mark type name as > collatable (with "C" as its typcollation, probably). There are > two plausible sub-approaches: > > 1. The regular name comparison operators remain non-collation-aware. > This would be the least invasive way but it'd have the odd side-effect > that expressions like "namecoll1 < namecoll2 COLLATE something" > would be accepted but the collation would be ignored. Also, we'd > have to invent some new names for the collation-aware name-vs-name > operators, and I don't see any obvious candidate for that. > > 2. Upgrade the name comparison operators to be collation-aware, > with (probably) all the same optimizations for C collation as we > have for text. This'd be a cleaner end result but it seems like > there are a lot of potential side-effects, e.g. syscache lookups > would have to be prepared to pass the right collation argument > to name comparisons. > > I feel like #2 is probably really the Right Thing, but it's also > sounding like significantly more work than I thought this was going > to involve. Not sure if it's worth the effort right now. > > Also, I think that either solution would lead to some subtle changes > in semantics. For example, right now if you compare a name column > to a text value, you get a text (collation-aware) comparison using > the database's default collation. It looks like if name columns > are marked with attcollation = 'C', that would win and the comparison > would now have 'C' collation unless you explicitly override it with > a COLLATE clause. I'm not sure this is a bad thing --- it'd be more > likely to match the sort order of the index on the column --- but it > could surprise people. > The sort of table's names is not too common operation. I don't see a C collate for names as any risk. Regards Pavel > > Thoughts? > > regards, tom lane > > [1] https://www.postgresql.org/message-id/5978.1544030...@sss.pgh.pa.us > >