[ https://issues.apache.org/jira/browse/HIVE-26968?focusedWorklogId=853567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853567 ]
ASF GitHub Bot logged work on HIVE-26968:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Mar/23 06:24
            Start Date: 29/Mar/23 06:24
    Worklog Time Spent: 10m
      Work Description: ngsg commented on PR #3981:
URL: https://github.com/apache/hive/pull/3981#issuecomment-1488012344

   Hello @zabetak.
   I have added a new qfile that validates my PR. In a nutshell, this qfile
   submits the same query twice while varying the value of
   hive.optimize.shared.work.dppunion. I confirmed that current Hive produces
   different results, as described in the JIRA issue
   (https://issues.apache.org/jira/browse/HIVE-26968).

   Could you please review the changes?
   Thank you.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 853567)
    Time Spent: 40m  (was: 0.5h)

> SharedWorkOptimizer merges TableScan operators that have different DPP parents
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-26968
>                 URL: https://issues.apache.org/jira/browse/HIVE-26968
>             Project: Hive
>          Issue Type: Sub-task
>   Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Critical
>              Labels: hive-4.0.0-must, pull-request-available
>         Attachments: TPC-DS Query64 OperatorGraph.pdf
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> SharedWorkOptimizer merges TableScan operators that have different DPP
> parents, which produces a semantically wrong query plan.
> In our environment, running TPC-DS query64 on a 1TB Iceberg-format table
> returns no rows because of this problem. (The correct result has 7094 rows.)
> We use hive.optimize.shared.work=true,
> hive.optimize.shared.work.extended=true, and
> hive.optimize.shared.work.dppunion=false to reproduce the bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)