On 09/27/2017 05:57 PM, Tom Lane wrote:
If we follow GIST's lead that the leading column is "most important",
the idea could be to require a search constraint on the first column,
which produces the hash that determines the bucket assignment. Hashes
for additional columns would just be payload data in the index entry.
If you have search constraint(s) on low-order column(s), you can check
for hash matches before visiting the heap, but they don't reduce how
much of the index you have to search. Even btree works that way for
many combinations of incomplete index constraints.
I feel that this would eliminate a large amount of potential gains from
such an index. This would be usable only when a sufficiently variable
column exists, in which case a simple hash index on the column wouldn't
be much worse.
But I have an idea. What if there was a requirement for the search
criteria to use tuple equality comparison:
where (a,b,c)=(?,?,?)
or
where (a,b,c) in ((?,?,?),(?,?,?),(?,?,?),(?,?,?))
Wouldn't it eliminate problems with disappearing conditions?
Actually maybe this could be implemented just like a functional index.
So it would implement reasonably something that otherwise would be a
terribly hackish and slow solution like:
create or replace function hashhack(a bytea, b bytea)
returns bigint
language sql
immutable
as $$
-- uses 'x1e' (record separator)
-- to ensure hashhack('a','')!=hashhack('','a')
select (
'x'
||
substr(
md5($1||'\x1e'::bytea||$2),
1,
16
)
)::bit(64)::bigint;
$$;
create index t_hashhack_a_b_idx
on t( hashhack(a::bytea,b::bytea) );
select * from t
where a='a' and b='b'
and
hashhack(a::bytea, b::bytea)
=
hashhack('a'::bytea,'b'::bytea);
If if was automatic man could avoid the overhead of converting data to
bytea/string, concatenating, truncating, converting back to bigint,
rechecking condition etc. that make this kind of hack not very sane.
Even providing a specially crafted function or operator for queries
specifically targeted for the index would be quite sufficient:
where pg_equal( (a,b,c), (?,?,?) );
--
Tomasz "Tometzky" Ostrowski
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers