[ https://issues.apache.org/jira/browse/HIVE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline updated HIVE-10885:
--------------------------------
    Attachment: HIVE-10885.02.patch

> with vectorization enabled join operation involving interval_day_time fails
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-10885
>                 URL: https://issues.apache.org/jira/browse/HIVE-10885
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Jagruti Varia
>            Assignee: Matt McCline
>         Attachments: HIVE-10885.01.patch, HIVE-10885.02.patch
>
>
> When vectorization is on, a join operation involving the interval_day_time type
> throws the following error:
> {noformat}
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1432858236614_0247_1_01, diagnostics=[Task failed, taskId=task_1432858236614_0247_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> {noformat}
> Query run:
> {noformat}
> select
>    v1.s,
>    v2.s,
>    v1.intrvl1
> from
>    ( select
>        s,
>        (cast(dt as date) - cast(ts as date)) as intrvl1
>      from
>        vectortab10korc ) v1
> join
>    (
>      select
>        s ,
>        (cast(dt as date) - cast(ts as date)) as intrvl2
>      from
>        vectorparttab10korc
>    ) v2
> on v1.intrvl1 = v2.intrvl2
> and v1.s = v2.s;
> {noformat}
> Explain plan:
> {noformat}
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE)
>       DagName: hrt_qa_20150601024305_7745bc8f-169f-45c6-8856-7391eef0d819:3
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: vectortab10korc
>                   filterExpr: s is not null (type: boolean)
>                   Statistics: Num rows: 10000 Data size: 4597592 Basic stats: COMPLETE Column stats: PARTIAL
>                   Filter Operator
>                     predicate: s is not null (type: boolean)
>                     Statistics: Num rows: 10000 Data size: 1340000 Basic stats: COMPLETE Column stats: PARTIAL
>                     Select Operator
>                       expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
>                       Filter Operator
>                         predicate: _col1 is not null (type: boolean)
>                         Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
>                         Reduce Output Operator
>                           key expressions: _col1 (type: interval_day_time), _col0 (type: string)
>                           sort order: ++
>                           Map-reduce partition columns: _col1 (type: interval_day_time), _col0 (type: string)
>                           Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
>                         Select Operator
>                           expressions: _col0 (type: string)
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
>                           Group By Operator
>                             keys: _col0 (type: string)
>                             mode: hash
>                             outputColumnNames: _col0
>                             Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
>                             Dynamic Partitioning Event Operator
>                               Target Input: vectorparttab10korc
>                               Partition key expr: s
>                               Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
>                               Target column: s
>                               Target Vertex: Map 2
>             Execution mode: vectorized
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: vectorparttab10korc
>                   filterExpr: s is not null (type: boolean)
>                   Statistics: Num rows: 10000 Data size: 3656191 Basic stats: COMPLETE Column stats: PARTIAL
>                   Select Operator
>                     expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
>                     Filter Operator
>                       predicate: _col1 is not null (type: boolean)
>                       Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col1 (type: interval_day_time), _col0 (type: string)
>                           1 _col1 (type: interval_day_time), _col0 (type: string)
>                         outputColumnNames: _col0, _col1, _col2
>                         input vertices:
>                           0 Map 1
>                         Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
>                         HybridGraceHashJoin: true
>                         Select Operator
>                           expressions: _col0 (type: string), _col2 (type: string), _col1 (type: interval_day_time)
>                           outputColumnNames: _col0, _col1, _col2
>                           Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
>                           File Output Operator
>                             compressed: false
>                             Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
>                             table:
>                                 input format: org.apache.hadoop.mapred.TextInputFormat
>                                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>             Execution mode: vectorized
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> Time taken: 0.402 seconds, Fetched: 91 row(s)
> {noformat}
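
A possible stopgap, not taken from the report itself: since the failure is reported only when vectorization is on, turning vectorization off for the session should avoid the failing VectorMapJoinCommonOperator initialization. The sketch below uses the standard hive.vectorized.execution.enabled setting. The narrower hive.vectorized.execution.mapjoin.native.enabled switch is shown commented out, because whether it alone avoids the VectorCopyRow path is an untested assumption.

{noformat}
-- Assumption: disabling vectorization for this session sidesteps the failing
-- vectorized map join initialization, at the cost of slower execution.
set hive.vectorized.execution.enabled=false;

-- Narrower alternative (untested assumption): keep vectorization on, but
-- disable the native vector map join path that appears in the stack trace.
-- set hive.vectorized.execution.mapjoin.native.enabled=false;

-- Then rerun the failing join query from the report in the same session.
{noformat}

Either setting applies only to the current session, so other queries keep their vectorized plans.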