> On Mar 2, 2021, at 11:42 AM, Mark Dilger <mark.dil...@enterprisedb.com> wrote:
>
>
>
>> On Mar 2, 2021, at 11:34 AM, Joel Jacobson <j...@compiler.org> wrote:
>>
>> Yes. It's random, since equality isn't changed, the sort operation cannot
>> tell the difference, and nor could a user who isn't aware of upper() /
>> lower() could reveal differences.
>
> This sounds unworkable even just in light of the original motivation for this
> whole thread. If I use your proposed regexp_positions(string text, pattern
> text, flags text) function to parse a large number of "positions" from a
> document, store all those positions in a table, and do a join of those
> positions against something else, it's not going to work. Positions will
> randomly vanish from the results of that join, which is going to be really
> surprising. I'm sure there are other examples of Tom's general point about
> compares-equal-but-not-equal datatypes.

I didn't phrase that clearly enough. What I'm asking is whether the bounds
information is included in the hash function. The current implementation of
hash_range(PG_FUNCTION_ARGS) hashes the lower and upper bounds, and since you
didn't change it to do otherwise, values that compare "equal" won't always
hash the same. I haven't tested this, but it seems you could get a different
set of rows depending on whether the planner selects a hash join.
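
To make the concern concrete, here is a toy sketch in Python. It is not
PostgreSQL code and I have not run anything like it against the patch. The
ToyRange class, its equality rule (all empty ranges compare equal), and the
two join helpers are assumptions made up for illustration. The only point is
that when equality ignores the bounds but the hash includes them, a nested
loop join and a hash join can return different rows.

```python
class ToyRange:
    """Hypothetical stand-in for a range value that keeps its bounds."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def is_empty(self):
        return self.lower == self.upper

    def __eq__(self, other):
        # Assumed rule: every empty range equals every other empty range,
        # regardless of the bounds it remembers.
        if self.is_empty() or other.is_empty():
            return self.is_empty() and other.is_empty()
        return (self.lower, self.upper) == (other.lower, other.upper)

    def __hash__(self):
        # Simple stand-in for a hash over the bounds. It is computed even
        # for empty ranges, so "equal" values can hash differently.
        return 31 * self.lower + self.upper


def nested_loop_join(left, right):
    # Compares every pair using equality only.
    return [(l, r) for l in left for r in right if l == r]


def hash_join(left, right):
    # Builds buckets on the hash value, then rechecks equality in-bucket.
    buckets = {}
    for r in right:
        buckets.setdefault(hash(r), []).append(r)
    return [(l, r) for l in left for r in buckets.get(hash(l), []) if l == r]


left = [ToyRange(1, 1), ToyRange(5, 9)]
right = [ToyRange(3, 3), ToyRange(5, 9)]

nl_rows = nested_loop_join(left, right)
hj_rows = hash_join(left, right)
print(len(nl_rows), len(hj_rows))
```

In this toy model the nested loop join pairs the two empty ranges, while the
hash join never looks in the same bucket for them, so that row goes missing.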
--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company