[ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20868:
---------------------------
    Description: 
In MapRecordProcessor::getFinalOp(), for a reason that is not yet known, the 
TezDummyStoreOperator may intermittently have a MergeJoin operator as its 
child. When this happens, fetchDone on the dummy operator stays true, as set 
by the previous task. Ideally, fetchDone should be reset for each task. As a 
result, the join operator skips rows from that dummy operator, which produces 
wrong results.

Good init order

{code}
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = TS[3] (core)
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = FIL[24]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = SEL[5]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: Iterating children of dummy op DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp returns DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: InitProcessor : setting fetchDone to false
{code}

Bad init order 

{code}
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = TS[3] (core)
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = FIL[24]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = SEL[5]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = DUMMY_STORE[45]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Iterating children of dummy op DUMMY_STORE[45]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Child of Dummy Op MERGEJOIN[44]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = MERGEJOIN[44]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = SEL[13]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = RS[14]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp returns RS[14]
{code}
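
The two traces can be modeled with a toy operator chain. This is only a 
sketch and not Hive code: the class FetchDoneSketch, the Op type and the 
methods finalOp, initFragile, initDefensive and buildChain are all 
hypothetical names. It assumes, based on the traces alone, that fetchDone is 
reset only when the traversal ends at the dummy store. initDefensive shows one 
possible way to make the reset independent of where the traversal ends; it is 
not necessarily what HIVE-20868.1.patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the traversal seen in the traces; not Hive's classes.
class FetchDoneSketch {

    /** Toy stand-in for an operator in the plan. */
    static class Op {
        final String name;
        final boolean isDummyStore;
        final List<Op> children = new ArrayList<>();
        boolean fetchDone; // only meaningful for the dummy store

        Op(String name, boolean isDummyStore) {
            this.name = name;
            this.isDummyStore = isDummyStore;
        }

        /** Adds a child and returns this op so chains can be nested. */
        Op child(Op c) {
            children.add(c);
            return this;
        }
    }

    /** Follows the first child until an op without children is reached. */
    static Op finalOp(Op root) {
        Op cur = root;
        while (!cur.children.isEmpty()) {
            cur = cur.children.get(0);
        }
        return cur;
    }

    /** Assumed fragile init: resets fetchDone only if the final op is the dummy store. */
    static void initFragile(Op root) {
        Op last = finalOp(root);
        if (last.isDummyStore) {
            last.fetchDone = false;
        }
    }

    /** Defensive variant: resets fetchDone on every dummy store along the chain. */
    static void initDefensive(Op root) {
        Op cur = root;
        while (cur != null) {
            if (cur.isDummyStore) {
                cur.fetchDone = false;
            }
            cur = cur.children.isEmpty() ? null : cur.children.get(0);
        }
    }

    /** Builds TS -> FIL -> SEL -> DUMMY_STORE, optionally continued by MERGEJOIN -> SEL -> RS. */
    static Op[] buildChain(boolean dummyHasChild) {
        Op dummy = new Op("DUMMY_STORE[45]", true);
        dummy.fetchDone = true; // stale value left behind by the previous task
        if (dummyHasChild) {
            dummy.child(new Op("MERGEJOIN[44]", false)
                .child(new Op("SEL[13]", false)
                    .child(new Op("RS[14]", false))));
        }
        Op root = new Op("TS[3]", false)
            .child(new Op("FIL[24]", false)
                .child(new Op("SEL[5]", false)
                    .child(dummy)));
        return new Op[] { root, dummy };
    }

    public static void main(String[] args) {
        Op[] good = buildChain(false);
        initFragile(good[0]);
        System.out.println("good order: finalOp=" + finalOp(good[0]).name
            + " fetchDone=" + good[1].fetchDone);

        Op[] bad = buildChain(true);
        initFragile(bad[0]);
        System.out.println("bad order:  finalOp=" + finalOp(bad[0]).name
            + " fetchDone=" + bad[1].fetchDone);

        initDefensive(bad[0]);
        System.out.println("bad order after defensive init: fetchDone=" + bad[1].fetchDone);
    }
}
```

In this model the good order ends at DUMMY_STORE[45] and fetchDone is reset, 
while the bad order ends at RS[14] and the stale fetchDone=true survives until 
the defensive variant is applied.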

  was:In MapRecordProcessor::getFinalOp() due to external cause(not known), the 
TezDummyStoreOperator may have MergeJoin Op as child intermittently. Due to 
this, the fetchDone remains set to true for the DummyOp which was set by 
previous task. Ideally, fetchDone should be reset for each task. This 
eventually leads to the join op skip rows from that dummy op resulting in wrong 
results.


> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20868
>                 URL: https://issues.apache.org/jira/browse/HIVE-20868
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Deepak Jaiswal
>            Assignee: Deepak Jaiswal
>            Priority: Major
>         Attachments: HIVE-20868.1.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
