On Tue, Mar 2, 2021, at 21:40, Mark Dilger wrote: > > > > On Mar 2, 2021, at 12:04 PM, Joel Jacobson <j...@compiler.org> wrote: > > > > On Tue, Mar 2, 2021, at 20:57, Mark Dilger wrote: > >> I didn't phrase that clearly enough. I'm thinking about whether you > >> include the bounds information in the hash function. The current > >> implementation of hash_range(PG_FUNCTION_ARGS) is going to hash the lower > >> and upper bounds, since you didn't change it to do otherwise, so "equal" > >> values won't always hash the same. I haven't tested this out, but it > >> seems you could get a different set of rows depending on whether the > >> planner selects a hash join. > > > > I think this issue is solved by the > > empty-ranges-with-bounds-information-v2.patch I just sent, > > since with it, there are no semantic changes at all, lower() and upper() > > works like before. > > There are semantic differences, because hash_range() doesn't call lower() and > upper(), it uses RANGE_HAS_LBOUND and RANGE_HAS_UBOUND, the behavior of which > you have changed. I created a regression test and expected results and > checked after applying your patch, and your patch breaks the hash function > behavior. Notice that before your patch, all three ranges hashed to the same > value, but not so after: > > > @@ -1,18 +1,18 @@ > select hash_range('[a,a)'::textrange); > hash_range > ------------ > - 484847245 > + -590102690 > (1 row) > > select hash_range('[b,b)'::textrange); > hash_range > ------------ > - 484847245 > + 281562732 > (1 row) > > select hash_range('[c,c)'::textrange); > - hash_range > ------------- > - 484847245 > + hash_range > +------------- > + -1887445565 > (1 row) > > > You might change how hash_range() works to get all "equal" values to hash the > same value, but that just gets back to the problem that non-equal things > appear to be equal. I guess that's your two-warty-feet preference, but not > everyone is going to be in agreement on that.
Yikes. Here be dragons. I think I want my wart free foot back please. Many thanks for explaining. I think I’ll abandon this patch. I guess implementing an entirely new range type could be an acceptable solution, but that’s too big of a project for me to manage on my own. If any more experienced hackers are interested in such a project, I would love to help if I can. > > — > Mark Dilger > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > > > > Kind regards, Joel