On Thu, 30 Sept 2021 at 11:54, Tom Lane <t...@sss.pgh.pa.us> wrote: > > David Rowley <dgrowle...@gmail.com> writes: > > This allows us to memoize any join expression, not just equality > > expressions. > > I am clearly failing to get through to you. Do I need to build > an example?
Taking your -0.0 / +0.0 float example, If I understand correctly due to -0.0 and +0.0 hashing to the same value and being classed as equal, we're really only guaranteed to get the same rows if the join condition uses the float value (in this example) directly. If for example there was something like a function we could pass that float value through that was able to distinguish -0.0 from +0.0, then that could cause issues as the return value of that function could be anything and have completely different join partners on the other side of the join. A function something like: create or replace function text_sign(f float) returns text as $$ begin if f::text like '-%' then return 'neg'; else return 'pos'; end if; end; $$ language plpgsql; would be enough to do it. If the join condition was text_sign(t1.f) = t2.col and the cache key was t1.f rather than text_sign(t1.f) On Thu, 30 Sept 2021 at 10:54, Tom Lane <t...@sss.pgh.pa.us> wrote: > So I'm now thinking you weren't that far wrong to be looking at > hashability of the top-level qual operator. What is missing is > that you have to restrict candidate cache keys to be the *direct* > arguments of such an operator. Looking any further down in the > expression introduces untenable assumptions. I think I also follow you now with the above. The problem is that if the join operator is able to distinguish something that the equality operator and hash function are not then we have the same problem. Restricting the join operator to hash equality ensures that the join condition cannot distinguish anything that we cannot distinguish in the cache hash table. David