[jira] [Updated] (HIVE-7882) Multiple occurrences of same aggregate function with different casing results in error

Pala M Muthaia (JIRA) Tue, 26 Aug 2014 00:39:07 -0700

     [ https://issues.apache.org/jira/browse/HIVE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Pala M Muthaia updated HIVE-7882:
---------------------------------

    Description: 
The query 

select sum(a), SUM(a) from t;

throws the error shown in the stack below, but the same query with matching 
case for the aggregate function succeeds:

select sum(a), sum(a) from t;

This is a regression from Hive 0.12. 

While the above is a contrived example to showcase the behavior, the use case 
is not artificial. In our case, we had "select sum(a) as x, SUM(a) + sum(b) as 
total...." in a complex query.

This seems related to the fix for HIVE-3107, which de-duplicated occurrences of 
aggregate functions while generating the reduce sink operator. However, the 
number of elements in the aggregationTrees dictionary, which is not similarly 
de-duplicated, was used to infer the total number of aggregate functions in the 
query, leading to a mismatch that causes this error.
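
The suspected mismatch can be illustrated with a small standalone sketch. This 
is not Hive's actual code: the class and method names below are hypothetical, 
and lower-casing the whole expression is a simplifying assumption standing in 
for whatever normalization the de-duplication really performs. It only shows 
how a count taken from a case-sensitive map can overrun a list that has been 
de-duplicated case-insensitively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Hive.
class AggregationMismatchSketch {

    // Stand-in for a map keyed by raw expression text: "sum(a)" and "SUM(a)"
    // are different keys, so they produce two entries.
    static Map<String, String> collectAggregations(List<String> exprs) {
        Map<String, String> trees = new LinkedHashMap<>();
        for (String e : exprs) {
            trees.put(e, e);
        }
        return trees;
    }

    // Stand-in for de-duplicated output columns: expressions that differ only
    // by case collapse into a single entry (assumed normalization).
    static List<String> dedupedColumns(List<String> exprs) {
        List<String> cols = new ArrayList<>();
        for (String e : exprs) {
            String normalized = e.toLowerCase(Locale.ROOT);
            if (!cols.contains(normalized)) {
                cols.add(normalized);
            }
        }
        return cols;
    }

    // Uses the size of the non-de-duplicated map as the loop bound while
    // indexing into the de-duplicated list. Returns true if that overruns.
    static boolean hitsOutOfBounds(List<String> exprs) {
        Map<String, String> trees = collectAggregations(exprs);
        List<String> cols = dedupedColumns(exprs);
        try {
            for (int i = 0; i < trees.size(); i++) {
                cols.get(i);
            }
            return false;
        } catch (IndexOutOfBoundsException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Matching case: one map entry, one column.
        System.out.println(hitsOutOfBounds(Arrays.asList("sum(a)", "sum(a)")));
        // Mixed case: two map entries, one column.
        System.out.println(hitsOutOfBounds(Arrays.asList("sum(a)", "SUM(a)")));
    }
}
```

In this sketch, making both steps agree on one notion of equality (either both 
case-sensitive or both normalized) removes the overrun. Whether that is the 
right fix for Hive itself is not established here.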

Error stack below.
-----
FAILED: IndexOutOfBoundsException Index: 1, Size: 1
14/08/20 14:52:33 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 1, Size: 1
java.lang.IndexOutOfBoundsException: Index: 18, Size: 18
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanReduceSinkOperator(SemanticAnalyzer.java:4121)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5098)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8154)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8994)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8882)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8903)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9271)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:428)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:984)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1051)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:917)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:907)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:270)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:222)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:794)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:688)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:627)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

  was:
The query 

select sum(x), SUM(x) from t;

throws error shown in the stack below but the same query with matching case for 
the aggregate function succeeds

select sum(x), sum(x) from t;

This is a regression from Hive 0.12. 

While the above is a contrived example to showcase the behavior, the use case 
not artificial. In our case, we had "select sum(x) as x, SUM(x) + sum(y) as 
total...." in a complex query.

This seems related to fix for HIVE-3107, which de-duplicated occurrences of 
aggregate functions while generating reduce sink operator. However, the number 
of elements in the aggregationTrees dictionary, which is not similarly 
deduplicated, was used to infer total number of aggregate function in the 
query, leading to a mismatch causing this error.

Error stack below.
-----
FAILED: IndexOutOfBoundsException Index: 1, Size: 1
14/08/20 14:52:33 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 1, Size: 1
java.lang.IndexOutOfBoundsException: Index: 18, Size: 18
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanReduceSinkOperator(SemanticAnalyzer.java:4121)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5098)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8154)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8994)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8882)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8903)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9271)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:428)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:984)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1051)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:917)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:907)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:270)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:222)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:794)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:688)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:627)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)


> Multiple occurrences of same aggregate function with different casing results 
> in error
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-7882
>                 URL: https://issues.apache.org/jira/browse/HIVE-7882
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.1
>            Reporter: Pala M Muthaia
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.2#6252)
