[ https://issues.apache.org/jira/browse/HIVE-21016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749611#comment-16749611 ]
Peter Vary commented on HIVE-21016:
-----------------------------------

Since I am not absolutely confident in my knowledge of this part of the code, I would prefer to have it in the master branch only - this means we will have more testing around it before it will be released.
If we find somebody who is more experienced with this part of the code that could be another story :)

Thanks,
Peter

> Duplicate column name in GROUP BY statement causing Vertex failures
> -------------------------------------------------------------------
>
>                 Key: HIVE-21016
>                 URL: https://issues.apache.org/jira/browse/HIVE-21016
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>            Reporter: Bjorn Olsen
>            Assignee: Mani M
>            Priority: Major
>
> Hive queries fail with "Vertex failure" messages when the user submits a
> query containing duplicate GROUP BY columns. The Hive query parser should
> detect and reject this scenario with a meaningful error message, rather than
> executing the query and failing with an obfuscated message. For complex
> queries this can result in a lot of debugging effort, whereas a simple error
> message could have saved some time.
>
> To repeat the issue, choose any table and perform a GROUP BY with a duplicate
> column name.
>
> For example:
>
> select count(*), party_id from party group by party_id, party_id;
>
> Note the duplicate column in the GROUP BY.
>
> This will fail with messages similar to below:
>
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) 0000ffb9-5fb1-3024-922a-10cc313a7c171
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:390)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:266)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
>     ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) 0000ffb9-5fb1-3024-922a-10cc313a7c171
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:454)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:381)
>     ... 17 more
> *Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector*
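
The check the report asks for (reject duplicate GROUP BY columns before the query runs) could look roughly like the sketch below. This is not Hive code and does not reflect how any actual patch is implemented: the class GroupByDuplicateCheck and its methods are hypothetical, it works on a plain list of column names rather than on the parsed query, and it assumes column names should be compared case-insensitively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Hive. It illustrates the requested check on
// an already-extracted list of GROUP BY column names.
final class GroupByDuplicateCheck {

    // Returns every column name that appears more than once, in the order the
    // repeats are encountered. Assumption: names are compared case-insensitively.
    static List<String> findDuplicates(List<String> groupByColumns) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String column : groupByColumns) {
            String key = column.trim().toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                // add() returned false, so this name was already in the list.
                duplicates.add(key);
            }
        }
        return new ArrayList<>(duplicates);
    }

    // Fails early with a readable message instead of letting the query execute.
    static void validate(List<String> groupByColumns) {
        List<String> duplicates = findDuplicates(groupByColumns);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                "Duplicate column(s) in GROUP BY: " + String.join(", ", duplicates));
        }
    }

    public static void main(String[] args) {
        // Column list taken from the example query in the report.
        try {
            validate(Arrays.asList("party_id", "party_id"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether such a query should be rejected, or the duplicate silently dropped, is a design decision for the actual fix; the sketch only shows the rejection variant described in the report.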