Re: compute_query_id and pg_stat_statements

Julien Rouhaud Tue, 11 May 2021 23:05:16 -0700

On Wed, May 12, 2021 at 02:33:35PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 12 May 2021 10:42:01 +0800, Julien Rouhaud <[email protected]> wrote 
> in 
> > 
> > I don't think that this approach would cope well for people who want a 
> > queryid
> > without pg_stat_statements or such.  Since the queryid can now be found in
> > pg_stat_activity, EXPLAIN output or the logs I think it's entirely 
> > reasonable
> > to allow users to benefit from that even if they don't install additional
> > module.
> 
> Ah, I missed that case.  And we are wanting to use pg_stat_statements
> with (almost) zero-config?  How about the following behavior?
> 
> Setting query_id_provider to 'none' means we don't calculate query-id
> by default. However, if queryIdWante() is called, the default provider
> is set up and starts calculating query id.


Having "none" meant "not unless someone asks for it" looks like a POLA
violation.

> Setting query_id_provider to something else means the user wants
> query-id calcualted using the provider. Setting 'default' is
> equivalent to setting compute_query_id to 'on'.
> 
> There might be a case where a user sets query_id_provider to
> non-'none' but don't want have query-id calculated, but it can be said
> a kind of mis-configuration?

So if I'm understanding correctly, you're arguing for an approach different to
what Michael stated as the general consensus in [1].  I'm not saying that I
think it's a bad idea (and I actually suggested it before), but we have to
chose an approach and stick with it.

> > I think this would be a mistake to do that, as it would mean that we don't
> > officially support alternative queryid provider.
> 
> Ok, if we want to support alternative providers from the first, we
> need to actually write the loader code for query-id providers.  It
> would not be so hard?, but it might not be suitable to this stage so I
> proposed that to get rid of needing such complexity for now.

I did write a POC extension [2] to demonstrate that moving pg_stat_statement's
queryid calculation in core doesn't mean that we're imposing it to everyone.
And yes this is critical and a must have in the initial implementation.

> (Anyway I prefer to load query-id provider as a dynamically loadable
>  module rather than hook-function.)

I agree that having a specific API (I'm fine with a hook or a dynamically
loaded function) for that would be better, but it doesn't appear to be the
opinion of the majority.

> > The GUC itself may not change, but third-party queryid provider would 
> > probably
> > need changes as the new entry point will be dedicated to compute a queryid
> > only, while third-party plugins may do more than that in their
> > post_parse_analyze_hook.  And also users will have to change their
> 
> I don't think it is not that a problem.

Did you mean "I don't think that it's a problem"?  Otherwise I don't get it.

> Even if any third-party
> extension is having query-id generator by itself, in most cases it
> would be a copy of JumbleQuery in case of pg_stat_statement is not
> loaded and now it is moved in-core as 'default' provider. What the
> exntension needs to be done is just ripping out the copied generator
> code.  I guess...

I don't fully understand, but it seems that you're arguing that the only use
case is to have something similar to pg_stat_statements (say e.g.
pg_store_plans), that always have the same queryid implementation as
pg_stat_statements.  That's not the case, as there already are "clones" of
pg_stat_statements, and the main difference is an alternative queryid
implementation.  So in that case what author would do is to drop everything
*except* the queryid implementation.

And if I'm not mistaken, pg_store_plans also wants a different queryid
implementation, but has to handle a secondary queryid on top of it
(https://github.com/ossc-db/pg_store_plans/blob/master/pg_store_plans.c#L843-L855).
So here again what the extension want is to get rid of pg_stat_statements (and
now core) queryid implementation.

> > configuration to use that new interface, and additionally the module may now
> > have to be removed from shared_preload_libraries.  Overall, it doesn't seem 
> > to
> > me that it would make users' life easier.
> 
> Why the third-party module need to be removed from
> shared_preload_libraries?  The module can stay as a preloaded shared
> library but just no longer need to have its own query-id provider
> since it is provided in-core.  If the extension required a specific
> provider, the developer need to make it a loadable module and users
> need to specify the provider module explicitly.

It's the same misunderstanding here.  Basically people want to benefit from the
whole ecosystem based on a queryid (pg_stat_statements, now
pg_stat_activity.query_id and such) but with another definition of what a
queryid is.  So those people will now only need to implement something like
[2], rather than forking every single extension they want to use.


[1]: https://www.postgresql.org/message-id/[email protected]
[2]: https://github.com/rjuju/pg_queryid

Re: compute_query_id and pg_stat_statements

Reply via email to