On Sun, Oct 06, 2024 at 03:32:02PM +0200, Guillaume Lelarge wrote: > I'm not sure I follow. That would mean that every time a query is executed, > it always gets the same amount of workers. Which is not guaranteed to be > true. > > I would agree, though, that parallelized_queries_launched is probably not > that interesting. I could get rid of it if you think it should go away.
My point is that these stats are useful to know which action may have to be taken when reaching a mean, and numbers in pg_stat_statements offer hints that something is going wrong and that a closer lookup at an EXPLAIN plan may be required, particularly if the total number of workers planned and launched aggregated in the counters is unbalanced across queries. If the planned/launched ratio is balanced across most queries queries, a GUC adjustment may be OK. If the ratio is very unbalanced in a lower set of queries, I'd also look at tweaking GUCs instead like the per_gather. These counters give information that one or the other may be required. > Well, I don't see this as an overlap. Rather more information. Later versions of Benoit's patch have been accumulating this data in the executor state. v4 posted at [1] has the following diffs: --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -724,6 +724,9 @@ typedef struct EState */ List *es_insert_pending_result_relations; List *es_insert_pending_modifytables; + + int es_workers_launched; + int es_workers_planned; } EState; Your v2 posted on this thread has that: @@ -707,6 +707,12 @@ typedef struct EState struct EPQState *es_epq_active; bool es_use_parallel_mode; /* can we use parallel workers? */ + bool es_used_parallel_mode; /* was executed in parallel */ + int es_parallelized_workers_launched; + int es_parallelized_workers_planned; + int es_parallelized_nodes; /* # of parallelized nodes */ + int es_parallelized_nodes_all_workers; /* # of nodes with all workers launched */ + int es_parallelized_nodes_no_worker; /* # of nodes with no workers launched */ es_parallelized_workers_launched and es_workers_launched are the same thing in both. > On this, I would agree with you. They are not that particularly useful to > get better setting for parallel GUCs. I can drop them if you want. Yep. I would remove them for now. This leads to more bloat. > Did this on the v2 version of the patch (attached here). > > Thanks for your review. If you want the parallelized_queries_launched > column and the parallelized_nodes_* columns dropped, I can do that on a v3 > patch. I'd recommend to split that into more independent patches: - Introduce the two counters in EState with the incrementations done in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may want to coordinate with Benoit to avoid duplicating the work). - Expand pg_stat_statements to use them for DMLs, SELECTs, well where they matter. - Look at expanding that for utilities that can do parallel jobs: CREATE INDEX and VACUUM, but this has lower priority to me, and this can reuse the same counters as the ones added by patch 2. [1]: https://www.postgresql.org/message-id/6ecad3ad-835c-486c-9ebd-da87a9a97...@dalibo.com [2]: https://www.postgresql.org/message-id/zv46wtmjltuu2...@paquier.xyz -- Michael
signature.asc
Description: PGP signature