[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vaibhav Gumashta updated HIVE-12947:
------------------------------------
    Target Version/s: 2.1.0, 2.0.0, 1.3.0  (was: 2.0.0, 2.1.0)

> SMB join in tez has ClassCastException when container reuse is on
> -----------------------------------------------------------------
>
>                 Key: HIVE-12947
>                 URL: https://issues.apache.org/jira/browse/HIVE-12947
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.0.0
>            Reporter: Vikram Dixit K
>            Assignee: Vikram Dixit K
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch,
> HIVE-12947.3.patch, HIVE-12947.4.patch
>
>
> SMB join in tez has multiple work items that are connected based on input tag
> followed by input initialization etc. In case of container re-use, what ends
> up happening is that we try to reconnect the work items and fail. If we try
> to work around that issue by recognizing somehow that the cache was in play,
> we will run into other initialization issues with respect to record readers.
> So the plan is to disable caching of the SMB work items by clearing out
> during the close phase.
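The plan described above (evict the SMB work items from the container-level cache during the close phase, so that a reused container rebuilds them) can be illustrated with a minimal sketch. This is not the code from the attached patches: `ContainerCache`, `SmbTaskSketch`, `SmbCacheDemo`, and the key `"smb-merge-work"` are hypothetical names chosen for illustration, and the cached "plan" is a plain `Object` standing in for a work item.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a cache that outlives a single task in a reused container. */
final class ContainerCache {
    private final Map<String, Object> entries = new HashMap<>();

    @SuppressWarnings("unchecked")
    <T> T retrieve(String key, Supplier<T> loader) {
        // Build the value only on a cache miss; later callers get the same instance.
        return (T) entries.computeIfAbsent(key, k -> loader.get());
    }

    void remove(String key) {
        entries.remove(key);
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }
}

/** Hypothetical processor: caches SMB work items in init() and evicts them in close(). */
final class SmbTaskSketch {
    private final ContainerCache cache;
    private final List<String> smbKeys = new ArrayList<>();

    SmbTaskSketch(ContainerCache cache) {
        this.cache = cache;
    }

    Object init(String workKey, Supplier<Object> planLoader) {
        // Remember which keys belong to this task so close() can evict them.
        smbKeys.add(workKey);
        return cache.retrieve(workKey, planLoader);
    }

    void close() {
        // Evict every SMB entry so the next task in this container rebuilds
        // the work items instead of reconnecting already-wired ones.
        for (String key : smbKeys) {
            cache.remove(key);
        }
        smbKeys.clear();
    }
}

final class SmbCacheDemo {
    public static void main(String[] args) {
        ContainerCache cache = new ContainerCache();
        int[] loads = {0};
        Supplier<Object> loader = () -> { loads[0]++; return new Object(); };

        // First task in the container: builds the plan, then clears it on close.
        SmbTaskSketch first = new SmbTaskSketch(cache);
        Object planA = first.init("smb-merge-work", loader);
        first.close();

        // Second task in the same (reused) container: gets a fresh plan.
        SmbTaskSketch second = new SmbTaskSketch(cache);
        Object planB = second.init("smb-merge-work", loader);

        System.out.println("loads=" + loads[0] + ", fresh=" + (planA != planB));
    }
}
```

The trade-off in this sketch is that every task pays the cost of rebuilding the SMB work items, in exchange for never seeing operator trees left over from a previous task in the same container.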
> {code}
> java.lang.RuntimeException: Map operator initialization failed
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>       at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>       at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to org.apache.hadoop.hive.ql.exec.DummyStoreOperator
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300)
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
>       at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189)
>       ... 15 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)