[jira] [Commented] (HIVE-10686) java.lang.IndexOutOfBoundsException for query with rank() over(partition ...)

Hive QA (JIRA) Tue, 12 May 2015 05:32:03 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539767#comment-14539767
 ]


Hive QA commented on HIVE-10686:
--------------------------------



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732211/HIVE-10686.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 8921 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_global_limit
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_limit
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3860/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3860/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3860/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732211 - PreCommit-HIVE-TRUNK-Build

> java.lang.IndexOutOfBoundsException for query with rank() over(partition ...)
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-10686
>                 URL: https://issues.apache.org/jira/browse/HIVE-10686
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-10686.patch
>
>
> CBO throws Index out of bound exception for TPC-DS Q70.
> Query 
> {code}
> explain
> select
>     sum(ss_net_profit) as total_sum
>    ,s_state
>    ,s_county
>    ,grouping__id as lochierarchy
>    , rank() over(partition by grouping__id, case when grouping__id == 2 then 
> s_state end order by sum(ss_net_profit)) as rank_within_parent
> from
>     store_sales ss join date_dim d1 on d1.d_date_sk = ss.ss_sold_date_sk
>     join store s on s.s_store_sk  = ss.ss_store_sk
>  where
>     d1.d_month_seq between 1193 and 1193+11
>  and s.s_state in
>              ( select s_state
>                from  (select s_state as s_state, sum(ss_net_profit),
>                              rank() over ( partition by s_state order by 
> sum(ss_net_profit) desc) as ranking
>                       from   store_sales, store, date_dim
>                       where  d_month_seq between 1193 and 1193+11
>                             and date_dim.d_date_sk = 
> store_sales.ss_sold_date_sk
>                             and store.s_store_sk  = store_sales.ss_store_sk
>                       group by s_state
>                      ) tmp1
>                where ranking <= 5
>              )
>  group by s_state,s_county with rollup
> order by
>    lochierarchy desc
>   ,case when lochierarchy = 0 then s_state end
>   ,rank_within_parent
>  limit 100
> {code}
> Original plan (correct)
> {code}
>  HiveSort(fetch=[100])
>   HiveSort(sort0=[$3], sort1=[$5], sort2=[$4], dir0=[DESC], dir1=[ASC], 
> dir2=[ASC])
>     HiveProject(total_sum=[$4], s_state=[$0], s_county=[$1], 
> lochierarchy=[$5], rank_within_parent=[rank() OVER (PARTITION BY $5, 
> when(==($5, 2), $0) ORDER BY $4 ROWS BETWEEN 2147483647 FOLLOWING AND 
> 2147483647 PRECEDING)], (tok_function when (= (tok_table_or_col lochierarchy) 
> 0) (tok_table_or_col s_state))=[when(=($5, 0), $0)])
>       HiveAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {}]], 
> indicator=[true], agg#0=[sum($2)], GROUPING__ID=[GROUPING__ID()])
>         HiveProject($f0=[$7], $f1=[$6], $f2=[$1])
>           HiveJoin(condition=[=($5, $2)], joinType=[inner], algorithm=[none], 
> cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
>             HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>               HiveProject(ss_sold_date_sk=[$0], ss_net_profit=[$21], 
> ss_store_sk=[$22])
>                 HiveTableScan(table=[[tpcds.store_sales]])
>               HiveProject(d_date_sk=[$0], d_month_seq=[$3])
>                 HiveFilter(condition=[between(false, $3, 1193, +(1193, 11))])
>                   HiveTableScan(table=[[tpcds.date_dim]])
>             HiveProject(s_store_sk=[$0], s_county=[$1], s_state=[$2])
>               SemiJoin(condition=[=($2, $3)], joinType=[inner])
>                 HiveProject(s_store_sk=[$0], s_county=[$23], s_state=[$24])
>                   HiveTableScan(table=[[tpcds.store]])
>                 HiveProject(s_state=[$0])
>                   HiveFilter(condition=[<=($1, 5)])
>                     HiveProject((tok_table_or_col s_state)=[$0], 
> rank_window_0=[rank() OVER (PARTITION BY $0 ORDER BY $1 DESC ROWS BETWEEN 
> 2147483647 FOLLOWING AND 2147483647 PRECEDING)])
>                       HiveAggregate(group=[{0}], agg#0=[sum($1)])
>                         HiveProject($f0=[$6], $f1=[$1])
>                           HiveJoin(condition=[=($5, $2)], joinType=[inner], 
> algorithm=[none], cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
>                             HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>                               HiveProject(ss_sold_date_sk=[$0], 
> ss_net_profit=[$21], ss_store_sk=[$22])
>                                 HiveTableScan(table=[[tpcds.store_sales]])
>                               HiveProject(d_date_sk=[$0], d_month_seq=[$3])
>                                 HiveFilter(condition=[between(false, $3, 
> 1193, +(1193, 11))])
>                                   HiveTableScan(table=[[tpcds.date_dim]])
>                             HiveProject(s_store_sk=[$0], s_state=[$24])
>                               HiveTableScan(table=[[tpcds.store]])
> {code}
> Plan after fixTopOBSchema (incorrect)
> {code}
>  HiveSort(fetch=[100])
>   HiveSort(sort0=[$3], sort1=[$5], sort2=[$4], dir0=[DESC], dir1=[ASC], 
> dir2=[ASC])
>     HiveProject(total_sum=[$4], s_state=[$0], s_county=[$1], 
> lochierarchy=[$5], rank_within_parent=[rank() OVER (PARTITION BY $5, 
> when(==($5, 2), $0) ORDER BY $4 ROWS BETWEEN 2147483647 FOLLOWING AND 
> 2147483647 PRECEDING)])
>       HiveAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {}]], 
> indicator=[true], agg#0=[sum($2)], GROUPING__ID=[GROUPING__ID()])
>         HiveProject($f0=[$7], $f1=[$6], $f2=[$1])
>           HiveJoin(condition=[=($5, $2)], joinType=[inner], algorithm=[none], 
> cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
>             HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>               HiveProject(ss_sold_date_sk=[$0], ss_net_profit=[$21], 
> ss_store_sk=[$22])
>                 HiveTableScan(table=[[tpcds.store_sales]])
>               HiveProject(d_date_sk=[$0], d_month_seq=[$3])
>                 HiveFilter(condition=[between(false, $3, 1193, +(1193, 11))])
>                   HiveTableScan(table=[[tpcds.date_dim]])
>             HiveProject(s_store_sk=[$0], s_county=[$1], s_state=[$2])
>               SemiJoin(condition=[=($2, $3)], joinType=[inner])
>                 HiveProject(s_store_sk=[$0], s_county=[$23], s_state=[$24])
>                   HiveTableScan(table=[[tpcds.store]])
>                 HiveProject(s_state=[$0])
>                   HiveFilter(condition=[<=($1, 5)])
>                     HiveProject((tok_table_or_col s_state)=[$0], 
> rank_window_0=[rank() OVER (PARTITION BY $0 ORDER BY $1 DESC ROWS BETWEEN 
> 2147483647 FOLLOWING AND 2147483647 PRECEDING)])
>                       HiveAggregate(group=[{0}], agg#0=[sum($1)])
>                         HiveProject($f0=[$6], $f1=[$1])
>                           HiveJoin(condition=[=($5, $2)], joinType=[inner], 
> algorithm=[none], cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
>                             HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>                               HiveProject(ss_sold_date_sk=[$0], 
> ss_net_profit=[$21], ss_store_sk=[$22])
>                                 HiveTableScan(table=[[tpcds.store_sales]])
>                               HiveProject(d_date_sk=[$0], d_month_seq=[$3])
>                                 HiveFilter(condition=[between(false, $3, 
> 1193, +(1193, 11))])
>                                   HiveTableScan(table=[[tpcds.date_dim]])
>                             HiveProject(s_store_sk=[$0], s_state=[$24])
>                               HiveTableScan(table=[[tpcds.store]])
> {code}
> Exception 
> {code}
> 15/04/14 02:42:52 [main]: ERROR parse.CalcitePlanner: CBO failed, skipping 
> CBO.
> java.lang.IndexOutOfBoundsException: Index: 5, Size: 5
>       at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>       at java.util.ArrayList.get(ArrayList.java:411)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitInputRef(ASTConverter.java:395)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitInputRef(ASTConverter.java:372)
>       at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:543)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:372)
>       at org.apache.calcite.rex.RexCall.accept(RexCall.java:107)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:543)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:372)
>       at org.apache.calcite.rex.RexCall.accept(RexCall.java:107)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertOBToASTNode(ASTConverter.java:252)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:208)
>       at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:98)
>       at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
>       at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:239)
>       at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10003)
>       at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:202)
>       at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
>       at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>       at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>       at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
>       at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-10686) java.lang.IndexOutOfBoundsException for query with rank() over(partition ...)

Reply via email to