> On Mar 2, 2021, at 11:42 AM, Mark Dilger <mark.dil...@enterprisedb.com> wrote:
> 
> 
> 
>> On Mar 2, 2021, at 11:34 AM, Joel Jacobson <j...@compiler.org> wrote:
>> 
>> Yes. It's random, since equality isn't changed, the sort operation cannot 
>> tell the difference, nor could a user who isn't aware of upper() / 
>> lower() reveal differences.
> 
> This sounds unworkable even just in light of the original motivation for this 
> whole thread.  If I use your proposed regexp_positions(string text, pattern 
> text, flags text) function to parse a large number of "positions" from a 
> document, store all those positions in a table, and do a join of those 
> positions against something else, it's not going to work.  Positions will 
> randomly vanish from the results of that join, which is going to be really 
> surprising.  I'm sure there are other examples of Tom's general point about 
> compares-equal-but-not-equal datatypes.

I didn't phrase that clearly enough.  I'm thinking about whether you include 
the bounds information in the hash function.  The current implementation of 
hash_range(PG_FUNCTION_ARGS) is going to hash the lower and upper bounds, since 
you didn't change it to do otherwise, so "equal" values won't always hash the 
same.  I haven't tested this out, but it seems you could get a different set of 
rows depending on whether the planner selects a hash join.
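
To make the concern concrete, here is a toy sketch in Python. It is not PostgreSQL code and I haven't run anything like it against the patch. ToyRange, nested_loop_join and hash_join are hypothetical names, and the model assumes that empty ranges compare equal whatever bounds they carry while the hash still mixes in the bounds. Under those assumptions, a join that compares every pair and a join that first buckets by hash return different row counts:

```python
class ToyRange:
    """Hypothetical half-open range [lower, upper); empty when lower >= upper."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def is_empty(self):
        return self.lower >= self.upper

    def __eq__(self, other):
        if not isinstance(other, ToyRange):
            return NotImplemented
        # Assumption: all empty ranges are "equal", whatever bounds they keep.
        if self.is_empty() and other.is_empty():
            return True
        return (self.lower, self.upper) == (other.lower, other.upper)

    def __hash__(self):
        # Deliberately inconsistent with __eq__: the bounds are always hashed.
        return self.lower * 31 + self.upper


def nested_loop_join(left, right):
    # Compares every pair, so it relies only on equality.
    return [(l, r) for l in left for r in right if l == r]


def hash_join(left, right):
    # Buckets the right side by hash, then probes with the left side's hash.
    buckets = {}
    for r in right:
        buckets.setdefault(hash(r), []).append(r)
    return [(l, r) for l in left for r in buckets.get(hash(l), []) if l == r]


left = [ToyRange(3, 3), ToyRange(1, 5)]
right = [ToyRange(7, 7), ToyRange(1, 5)]

print(len(nested_loop_join(left, right)))  # 2: the two empty ranges match
print(len(hash_join(left, right)))         # 1: the empty ranges land in different buckets
```

The row for the two empty ranges is present or absent depending purely on the join strategy, which is the kind of vanishing result I'm worried about.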

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




