On 14/2/2025 08:21, Michael Paquier wrote:
> On Thu, Feb 13, 2025 at 11:10:27AM -0600, Sami Imseih wrote:
>> I don't think direct setting of values is a good idea. We will need an API
>> similar to pgstat_report_query_id which ensures we are only reporting top
>> level planIds -and- in the case of multiple extensions with the capability
>> to set a planId, only the first one in the stack wins. pgstat_report_query_id
>> does allow for forcing a queryId ( force flag which is false by default ), and
>> I think this new API should allow the same.
> I'm obviously siding with this proposal because we have an ask to
> track this kind of data, and because this kind of data is kind of hard
> to track across a stack of extensions through the core backend code.
>
> Point is, would others be interested in this addition or just object
> to it because it touches the core code?

I have already implemented it twice in different ways as a core patch.
In my projects, we need to track queryId and plan node ID for two reasons:
1. To record the optimisation decisions made from the transformation/path generation stages through to the end of execution, so that they can be corrected in the future.
2. To cache information about the query tree/node state for statistical purposes.

In my experience, we don't need a single plan_id field; we just need an 'extended list' pointer at the end of the Plan, PlannedStmt, Query, and RelOptInfo structures, and a hook at the end of create_plan_recurse() to allow passing some info from the path generator to the plan tree. An extension may add its data to the list (we may register an extensible node type to be sure we don't interfere with other extensions) and manipulate it in a custom way and with a custom UI.
Generally, it makes the optimiser internals more open to extensions.
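
To make the idea concrete, here is a rough standalone sketch of an 'extended list' pointer plus a hook called at the end of plan creation. It is a toy model only, not PostgreSQL code: ToyPath, ToyPlan, ExtEntry, toy_create_plan_hook, ext_list_add(), ext_list_find() and my_ext_hook() are all hypothetical names invented for illustration, and the real create_plan_recurse() and Plan structure are considerably more involved. Entries are tagged with an owner string here, standing in for the extensible node type mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; NOT the PostgreSQL definitions. */
typedef struct ExtEntry
{
    const char      *owner;     /* which extension owns this entry */
    void            *data;      /* extension-private payload */
    struct ExtEntry *next;
} ExtEntry;

typedef struct ToyPath
{
    double      rows;
    double      cost;
} ToyPath;

typedef struct ToyPlan
{
    double      plan_rows;
    ExtEntry   *ext_list;       /* hypothetical 'extended list' at the end */
} ToyPlan;

/* Hypothetical hook invoked once the plan node has been built from the path. */
typedef void (*toy_create_plan_hook_type) (const ToyPath *path, ToyPlan *plan);
toy_create_plan_hook_type toy_create_plan_hook = NULL;

/* Prepend an entry tagged with the owning extension's name. */
void
ext_list_add(ToyPlan *plan, const char *owner, void *data)
{
    ExtEntry   *entry = malloc(sizeof(ExtEntry));

    assert(entry != NULL);
    entry->owner = owner;
    entry->data = data;
    entry->next = plan->ext_list;
    plan->ext_list = entry;
}

/* Return the payload stored by 'owner', or NULL if it stored nothing. */
void *
ext_list_find(const ToyPlan *plan, const char *owner)
{
    for (const ExtEntry *e = plan->ext_list; e != NULL; e = e->next)
    {
        if (strcmp(e->owner, owner) == 0)
            return e->data;
    }
    return NULL;
}

/* Toy analogue of plan creation: build the node, then let extensions annotate it. */
ToyPlan *
toy_create_plan(const ToyPath *path)
{
    ToyPlan    *plan = calloc(1, sizeof(ToyPlan));

    assert(plan != NULL);
    plan->plan_rows = path->rows;

    if (toy_create_plan_hook)
        toy_create_plan_hook(path, plan);

    return plan;
}

/* Example extension: carry the path's cost estimate over to the plan node. */
void
my_ext_hook(const ToyPath *path, ToyPlan *plan)
{
    double     *saved_cost = malloc(sizeof(double));

    assert(saved_cost != NULL);
    *saved_cost = path->cost;
    ext_list_add(plan, "my_ext", saved_cost);
}
```

An extension would set toy_create_plan_hook = my_ext_hook at load time and later call ext_list_find(plan, "my_ext") to read its own data back, without touching entries added by other extensions.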

--
regards, Andrei Lepikhov

