Re: hash as an search key and hash collision

Andy Fan Fri, 19 Jun 2020 16:34:20 -0700

On Sat, Jun 20, 2020 at 12:34 AM Tomas Vondra <tomas.von...@2ndquadrant.com>
wrote:


> On Fri, Jun 19, 2020 at 04:24:01PM +0800, Andy Fan wrote:
> >I want to maintain an internal table which the primary key is sql_text and
> >planstmt::text, it is efficient since it both may be very long.  So a
> >general
> >idea is to use sql_hash_value and plan_hash_value. Then we have to
> >handle the hash collision case.  However I checked the codes both in
> >sr_plans[1]
> >and pg_stat_statements[2],  both of them didn't handle such cases, IIUC.
> so
> >how can I understand this situation?
> >
>
> IIRC pg_stat_statements simply accepts the hash collision risk. This is
> what the docs say:
>
>      In some cases, queries with visibly different texts might get merged
>      into a single pg_stat_statements entry. Normally this will happen
>      only for semantically equivalent queries, but there is a small
>      chance of hash collisions causing unrelated queries to be merged
>      into one entry. (This cannot happen for queries belonging to
>      different users or databases, however.)
>
> The consequences of a hash collision are relatively harmless, enough to
> make it not worth the extra checks (e.g. because the SQL text may not be
> available in memory and would need to be read from the file).
>

I see.  Thank you for this information,  this does make sense.

I suppose sr_plan does the same thing, but I haven't checked.
>

sr_plans is used to map a sql hash value to a PlannedStmts,  if hash
collisions
happen,  it may execute a query B while the user wants to execute Query A.
this should be more sensitive than pg_stat_statements which doesn't require
exact data.   I added Ildus who is the author of sr_plan to the cc list
in case he wants to take a look.

-- 
Best Regards
Andy Fan

Re: hash as an search key and hash collision

Reply via email to