Hi Seamus,

On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <sea...@abshere.net> wrote:
> It turns out parallel_workers may be a useful reloption for certain uses of 
> partitioned tables, at least if they're made up of fancy column store 
> partitions (see 
> https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
>
> Would somebody tell me what I'm doing wrong? I would love to submit a patch 
> but I'm stuck:
>
> diff --git a/src/backend/optimizer/path/allpaths.c 
> b/src/backend/optimizer/path/allpaths.c
> index cd3fdd259c..f1ade035ac 100644
> --- a/src/backend/optimizer/path/allpaths.c
> +++ b/src/backend/optimizer/path/allpaths.c
> @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double 
> heap_pages, double index_pages,
>          * If the user has set the parallel_workers reloption, use that; 
> otherwise
>          * select a default number of workers.
>          */
> +       // I want to affect this
>         if (rel->rel_parallel_workers != -1)
>                 parallel_workers = rel->rel_parallel_workers;
>         else
>
> so I do this
>
> diff --git a/src/backend/access/common/reloptions.c 
> b/src/backend/access/common/reloptions.c
> index c687d3ee9e..597b209bfb 100644
> --- a/src/backend/access/common/reloptions.c
> +++ b/src/backend/access/common/reloptions.c
> @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum 
> options, bool validate)
>  bytea *
>  partitioned_table_reloptions(Datum reloptions, bool validate)
>  {
> -       /*
> -        * There are no options for partitioned tables yet, but this is able 
> to do
> -        * some validation.
> -        */
> +       static const relopt_parse_elt tab[] = {
> +               {"parallel_workers", RELOPT_TYPE_INT,
> +               offsetof(StdRdOptions, parallel_workers)},
> +       };
> +
>         return (bytea *) build_reloptions(reloptions, validate,
>                                                                           
> RELOPT_KIND_PARTITIONED,
> -                                                                         0, 
> NULL, 0);
> +                                                                         
> sizeof(StdRdOptions),
> +                                                                         
> tab, lengthof(tab));
>  }
>
> That "works":
>
> postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 
> 33);
> ALTER TABLE
> postgres=# select relname, relkind, reloptions from pg_class where relname = 
> 'test_3pd_cstore_partitioned';
>            relname           | relkind |      reloptions
> -----------------------------+---------+-----------------------
>  test_3pd_cstore_partitioned | p       | {parallel_workers=33}
> (1 row)
>
> But it seems to be ignored:
>
> diff --git a/src/backend/optimizer/path/allpaths.c 
> b/src/backend/optimizer/path/allpaths.c
> index cd3fdd259c..c68835ce38 100644
> --- a/src/backend/optimizer/path/allpaths.c
> +++ b/src/backend/optimizer/path/allpaths.c
> @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double 
> heap_pages, double index_pages,
>          * If the user has set the parallel_workers reloption, use that; 
> otherwise
>          * select a default number of workers.
>          */
> +       // I want to affect this, but this assertion always passes
> +       Assert(rel->rel_parallel_workers == -1)
>         if (rel->rel_parallel_workers != -1)
>                 parallel_workers = rel->rel_parallel_workers;
>         else

You may see by inspecting the callers of compute_parallel_worker()
that it never gets called on a partitioned table, only its leaf
partitions.  Maybe you could try calling compute_parallel_worker()
somewhere in add_paths_to_append_rel(), which has this code to figure
out parallel_workers to use for a parallel Append path for a given
partitioned table:

        /* Find the highest number of workers requested for any subpath. */
        foreach(lc, partial_subpaths)
        {
            Path       *path = lfirst(lc);

            parallel_workers = Max(parallel_workers, path->parallel_workers);
        }
        Assert(parallel_workers > 0);

        /*
         * If the use of parallel append is permitted, always request at least
         * log2(# of children) workers.  We assume it can be useful to have
         * extra workers in this case because they will be spread out across
         * the children.  The precise formula is just a guess, but we don't
         * want to end up with a radically different answer for a table with N
         * partitions vs. an unpartitioned table with the same data, so the
         * use of some kind of log-scaling here seems to make some sense.
         */
        if (enable_parallel_append)
        {
            parallel_workers = Max(parallel_workers,
                                   fls(list_length(live_childrels)));
            parallel_workers = Min(parallel_workers,
                                   max_parallel_workers_per_gather);
        }
        Assert(parallel_workers > 0);

        /* Generate a partial append path. */
        appendpath = create_append_path(root, rel, NIL, partial_subpaths,
                                        NIL, NULL, parallel_workers,
                                        enable_parallel_append,
                                        -1);

Note that the 'rel' in this code refers to the partitioned table for
which an Append path is being considered, so compute_parallel_worker()
using that 'rel' would use the partitioned table's
rel_parallel_workers as you are trying to do.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Reply via email to