Re: [D] [Ideas] Optimize SQL: select distinct count(a) from t1; [cloudberry]

via GitHub Tue, 03 Dec 2024 19:31:48 -0800


GitHub user avamingli closed a discussion: [Ideas] Optimize SQL: select 
distinct count(a) from t1;


### Description

User has a auto-generated SQL by 3rd-party like:

```sql
select DISTINCT count(distinct a) form t1;
```

However, for such query:  Aggregate SQL without Group by, there is one row 
returned at most.
The first DISTINCT could be removed in theory, then the Unique and Sort(May be 
other nodes due to planner) on Finalize Aggregate could be avoided.
```sql
explain(costs off) select distinct count(distinct a) from t1;
                         QUERY PLAN
------------------------------------------------------------
 Unique
   Group Key: (count(DISTINCT a))
   ->  Sort
         Sort Key: (count(DISTINCT a))
         ->  Finalize Aggregate
               ->  Gather Motion 3:1  (slice1; segments: 3)
                     ->  Partial Aggregate
                           ->  Seq Scan on t1
 Optimizer: Postgres query optimizer
(9 rows)
```
Currently, Postgres planner doesn't have that feature.

Shall we do it in CBDB?

Pro:
    Simply plan tree a little.
Con:
   User's SQL is changed, and there may be other risk for planner.







### Use case/motivation

_No response_

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

GitHub link: https://github.com/apache/cloudberry/discussions/677

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [D] [Ideas] Optimize SQL: select distinct count(a) from t1; [cloudberry]

Reply via email to