On Sat, Aug 10, 2019 at 3:27 AM Jim Finnerty <jfinn...@amazon.com> wrote: > > I missed this thread. I'd be happy to post the code for what we use as the > stable query identifier, but we could definitely come up with a more > efficient algorithm if we're willing to assume that the sql statements are > the same if and only if the parse tree structure is the same. > > Currently what we do for the sql hash is to simply replace all the literals > and then hash the resulting SQL string
Isn't that what pg_store_plan is already doing? Except that it removes extraneous whitespaces and put identifiers in uppercase so that you get a reasonable query identifier. > you could define a stable identifier for each node type, ignore literal > constants, and hash fully-qualified object names instead of OIDs. That > should be pretty fast. This has been discussed already, and resolving all object names and qualifier names will add a dramatic overhead for many workloads.