GitHub user avamingli created a discussion: Give UNION ALL more opportunities 
for parallel plans in MPP.

### Description


In CBDB's PostgreSQL-based planner, UNION ALL operations currently have four 
processing methods: three parallel variants (Parallel Append over partial 
subpaths, Parallel Append over a mix of partial and non-partial subpaths, and 
plain Append over partial subpaths) and one non-parallel method (plain Append 
over non-partial subpaths).

1. Parallel Append with partial subpaths

```sql
explain (costs off)
SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
UNION ALL
SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
                     QUERY PLAN                     
----------------------------------------------------
 Gather
   Workers Planned: 4
   ->  Parallel Append
         ->  Parallel Seq Scan on tenk1
               Filter: (fivethous = (tenthous + 1))
         ->  Parallel Seq Scan on tenk1 tenk1_1
               Filter: (fivethous = (tenthous + 1))
(7 rows)
```
2. Parallel Append with mixed partial/non-partial subpaths

```sql
explain (costs off)
  select round(avg(aa)), sum(aa) from a_star;
                          QUERY PLAN                          
--------------------------------------------------------------
 Finalize Aggregate
   ->  Gather
         Workers Planned: 3
         ->  Partial Aggregate
               ->  Parallel Append
                     ->  Seq Scan on d_star a_star_4
                     ->  Seq Scan on c_star a_star_3
                     ->  Parallel Seq Scan on f_star a_star_6
                     ->  Parallel Seq Scan on e_star a_star_5
                     ->  Parallel Seq Scan on b_star a_star_2
                     ->  Parallel Seq Scan on a_star a_star_1
```

3. Append with partial subpaths
```sql
explain (costs off)
  select round(avg(aa)), sum(aa) from a_star;
                          QUERY PLAN                          
--------------------------------------------------------------
 Finalize Aggregate
   ->  Gather
         Workers Planned: 1
         ->  Partial Aggregate
               ->  Append
                     ->  Parallel Seq Scan on a_star a_star_1
                     ->  Parallel Seq Scan on b_star a_star_2
                     ->  Parallel Seq Scan on c_star a_star_3
                     ->  Parallel Seq Scan on d_star a_star_4
                     ->  Parallel Seq Scan on e_star a_star_5
                     ->  Parallel Seq Scan on f_star a_star_6
(11 rows)
```

4. Append with non-partial subpaths

```sql
select * from
  (select *, 0 as x from int8_tbl a
   union all
   select *, 1 as x from int8_tbl b) ss
where (x = 0) or (q1 >= q2 and q1 <= q2);
                 QUERY PLAN                  
---------------------------------------------
 Append
   ->  Seq Scan on int8_tbl a
   ->  Seq Scan on int8_tbl b
         Filter: ((q1 >= q2) AND (q1 <= q2))
(4 rows)
```
While these work well for local queries, we have had to disable parallel 
execution whenever a Motion node appears in a subpath, due to a critical 
correctness issue: Parallel Append workers mark subnodes as completed, which 
can cause Motion-containing branches to be skipped prematurely.

This limitation forces serial execution for most distributed-table queries. In 
the example plan below, a Gather Motion wraps a serial Append with a 
Redistribute Motion in each branch, missing significant optimization 
opportunities.

```sql
    explain(costs off) select b, count(*) from t1 group by b union all
    select b, count(*) from t2 group by b;
                                QUERY PLAN
    ------------------------------------------------------------------
     Gather Motion 3:1  (slice1; segments: 3)
       ->  Append
             ->  HashAggregate
                   Group Key: t1.b
                   ->  Redistribute Motion 3:3  (slice2; segments: 3)
                         Hash Key: t1.b
                         ->  Seq Scan on t1
             ->  HashAggregate
                   Group Key: t2.b
                   ->  Redistribute Motion 3:3  (slice3; segments: 3)
                         Hash Key: t2.b
                         ->  Seq Scan on t2
     Optimizer: Postgres query optimizer
    (13 rows)
```

There is still an opportunity for parallelism, however.
I propose a robust solution that first attempts a parallel-aware Append when it 
is safe (no Motion nodes in any subpath), then automatically falls back to a 
parallel-oblivious Append when a Motion hazard is detected. The fallback is 
correct because a plain Append reliably executes all of its subnodes regardless 
of Motion presence, while CBDB's Motion nodes inherently handle tuples 
individually without requiring worker coordination.

The benefits extend beyond the UNION ALL operator itself: enabling this 
parallelization has ripple effects, giving subpaths more parallel execution 
opportunities, which is particularly valuable for complex nested queries. The 
optimization stands to significantly improve TPC-DS benchmark performance and 
other analytical workloads involving distributed tables.

We're preparing to implement this with prototype development, TPC-DS testing, 
and edge case validation. Community feedback is welcome on potential corner 
cases, benchmarking approaches, and real-world query patterns that might 
benefit most.


### Use case/motivation

_No response_

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

GitHub link: https://github.com/apache/cloudberry/discussions/1291

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org

