On Sat, Dec 29, 2012 at 7:16 PM, Daniel Farina <dan...@heroku.com> wrote:
> On Sat, Dec 29, 2012 at 7:12 PM, Peter Geoghegan <pe...@2ndquadrant.com> wrote:
>> On 30 December 2012 02:45, Daniel Farina <dan...@heroku.com> wrote:
>>> As I recall, the gist of this objection had to do with a false sense
>>> of stability of the hash value, and the desire to enforce the ability
>>> to alter it. Here's an option: xor the hash value with the
>>> 'statistics session id', so it's *known* to be unstable between
>>> sessions. That gets you continuity in the common case and sound
>>> deprecation in the less-common cases (crashes, format upgrades, stat
>>> resetting).
>>
>> Hmm. I like the idea, but a concern there would be that you'd
>> introduce additional scope for collisions in the third-party utility
>> building time-series data from snapshots. I currently put the
>> probability of a collision within pg_stat_statements as 1% in the
>> event of a pg_stat_statements.max of 10,000.
>
> We can use a longer session key and duplicate the queryid (effectively
> padding) a couple of times to complete the XOR. I think that makes
> the cases of collisions introduced by this astronomically low, as an
> increase over the base collision rate.
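For reference, the 1% figure quoted above can be roughly reproduced with the usual birthday approximation. This is only a sketch: it assumes the query hash is 32 bits wide and uniformly distributed, neither of which is stated in this thread, and the function name is made up for illustration.

```python
import math


def collision_probability(n_entries, hash_bits):
    """Birthday approximation: chance that at least two of n_entries
    uniformly distributed hashes of hash_bits bits are equal."""
    pairs = n_entries * (n_entries - 1) / 2
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# Assumption: pg_stat_statements.max = 10,000 and a 32-bit hash.
p32 = collision_probability(10_000, 32)
# The same entry count against a 64-bit value, for comparison.
p64 = collision_probability(10_000, 64)
print(f"32-bit: {p32:.4f}  64-bit: {p64:.2e}")
```

Under those assumptions the 32-bit case comes out a little above 1%, while widening the value to 64 bits makes the added collision risk negligible, which is the point of using a longer session key.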
A version implementing that is attached, except that I also generate an additional 64-bit session key that is not exposed to the client, to prevent even casual de-leaking of the session state. That may seem absurd, until someone writes a tool that de-xors things and relies on it, and then nobody feels inclined to break it. It also keeps the public session number short. I also opted to save the underestimate, since I'm adding a handful of fixed-width fields to the file format anyway.

--
fdr
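The scheme described above can be sketched as follows. This is not the code from the attached patch (which is C and is not reproduced here); it is a minimal illustration assuming a 32-bit queryid, a 32-bit public session number, and a 64-bit private key. The function names and the public-number width are hypothetical.

```python
import secrets

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def new_session():
    """Start a statistics session: a short public number that clients may
    see, plus a 64-bit private key that is never exposed (widths assumed)."""
    public_session = secrets.randbits(32)
    private_key = secrets.randbits(64)
    return public_session, private_key


def pad_queryid(queryid):
    """Duplicate the 32-bit queryid to fill 64 bits (the 'padding' step)."""
    q = queryid & MASK32
    return (q << 32) | q


def exposed_id(queryid, private_key):
    """XOR the padded queryid with the private session key."""
    return pad_queryid(queryid) ^ (private_key & MASK64)


public_session, private_key = new_session()
print(public_session, hex(exposed_id(42, private_key)))
```

Within one session the mapping is one-to-one, so distinct queryids stay distinct and no new collisions are introduced. After a crash, format upgrade, or stats reset, a new key is generated and the exposed values change, so a tool cannot mistake them for stable identifiers. Because the key is private, a client that knows only the public session number cannot undo the XOR.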
pg_stat_statements-identification-v3.patch.gz
Description: GNU Zip compressed data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers