[jira] [Commented] (FLINK-12173) Optimize "SELECT DISTINCT" into Deduplicate with keep first row

Jim Hughes (Jira) Mon, 16 Dec 2024 13:56:08 -0800


    [ https://issues.apache.org/jira/browse/FLINK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906203#comment-17906203 ]


Jim Hughes commented on FLINK-12173:
------------------------------------

Hi [~lincoln.86xy], we have not done any benchmarking yet.  I took a quick look 
at Nexmark, and I do not believe there are queries of this form.

Is using that test harness the easiest/fastest way to benchmark things?

> Optimize "SELECT DISTINCT" into Deduplicate with keep first row
> ---------------------------------------------------------------
>
>                 Key: FLINK-12173
>                 URL: https://issues.apache.org/jira/browse/FLINK-12173
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>            Reporter: Jark Wu
>            Assignee: Yiyu Tian
>            Priority: Major
>              Labels: pull-request-available
>
> The following distinct query can be optimized into a Deduplicate on keys "a, b, 
> c, d" that keeps the first row.
> {code:sql}
> SELECT DISTINCT a, b, c, d;
> {code}
> We can optimize this query into Deduplicate to get better performance than 
> GroupAggregate.
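
For readers unfamiliar with the "keep first row" idea in the quoted description, here is a minimal Python sketch of the semantics. It is a toy model, not Flink's operator code: the function names and the sample rows are hypothetical, and it assumes an append-only input. It contrasts remembering only which keys have been seen with keeping a per-key counter, as an aggregate-style plan might.

```python
from typing import Dict, Hashable, Iterable, Iterator, Set, Tuple

Row = Tuple[Hashable, ...]


def distinct_keep_first_row(rows: Iterable[Row]) -> Iterator[Row]:
    """Emit each distinct row once, the first time it arrives.

    The whole row (a, b, c, d) is the deduplication key, so the only
    state needed is the set of keys already seen.
    """
    seen: Set[Row] = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def distinct_with_counts(rows: Iterable[Row]) -> Iterator[Row]:
    """Toy aggregate-style variant: keeps and updates a counter per key.

    On append-only input it emits the same rows, but it touches state
    for every incoming row, including duplicates.
    """
    counts: Dict[Row, int] = {}
    for row in rows:
        counts[row] = counts.get(row, 0) + 1
        if counts[row] == 1:
            yield row


if __name__ == "__main__":
    # Hypothetical sample rows with columns a, b, c, d.
    stream = [(1, "x", 2, 3), (1, "x", 2, 3), (4, "y", 5, 6), (1, "x", 2, 3)]
    print(list(distinct_keep_first_row(stream)))
    print(list(distinct_with_counts(stream)))
```

Both functions return the same rows on append-only input; the difference the sketch illustrates is only how much state is kept and updated per row. Whether that translates into a measurable gain in Flink is exactly what benchmarking would need to show.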



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
