[ 
https://issues.apache.org/jira/browse/HIVE-26968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729297#comment-17729297
 ] 

Seonggon Namgung commented on HIVE-26968:
-----------------------------------------

[~zabetak] If my memory serves me right, there is a slight difference in DPP 
optimization between native and non-native (e.g. Iceberg) tables. So I guess 
that hive.optimize.shared.work.dppunion matters only when we reproduce this 
issue on an Iceberg table.
Since the problem comes from SharedWorkOptimizer rather than from anything 
Iceberg-specific, I think the new qfile is sufficient to test this issue.

> SharedWorkOptimizer merges TableScan operators that have different DPP parents
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-26968
>                 URL: https://issues.apache.org/jira/browse/HIVE-26968
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Critical
>              Labels: hive-4.0.0-must, pull-request-available
>         Attachments: TPC-DS Query64 OperatorGraph.pdf
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> SharedWorkOptimizer merges TableScan operators that have different DPP 
> parents, which leads to a semantically wrong query plan.
> In our environment, running TPC-DS query64 on a 1TB Iceberg-format table 
> returns no rows because of this problem. (The correct result has 7094 rows.)
> We use hive.optimize.shared.work=true, 
> hive.optimize.shared.work.extended=true, and 
> hive.optimize.shared.work.dppunion=false to reproduce the bug.
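
For reference, the three settings named in the description can be applied in 
a Hive session with SET statements. This is only a sketch of the 
configuration: the property names and values are taken from the description, 
while the query itself (TPC-DS query64) and the Iceberg table setup are not 
shown. Per the comment above, hive.optimize.shared.work.dppunion may only 
matter when the table is an Iceberg table.

```sql
-- Settings quoted in the issue description; run these before the query under test.
SET hive.optimize.shared.work=true;
SET hive.optimize.shared.work.extended=true;
SET hive.optimize.shared.work.dppunion=false;
```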



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
