[jira] [Commented] (HIVE-16291) Hive fails when unions a parquet table with itself

Hive QA (JIRA) Fri, 24 Mar 2017 08:07:13 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940505#comment-15940505
 ]


Hive QA commented on HIVE-16291:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12860357/HIVE-16291.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10512 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4340/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12860357 - PreCommit-HIVE-Build

> Hive fails when unions a parquet table with itself
> --------------------------------------------------
>
>                 Key: HIVE-16291
>                 URL: https://issues.apache.org/jira/browse/HIVE-16291
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Yibing Shi
>            Assignee: Yibing Shi
>         Attachments: HIVE-16291.1.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin union all select tst_unin.p_tdate from tst_unin where tst_unin.col1=20160302) t1;
> {code}
> The table is stored in Parquet, a columnar file format. Hive tries to push 
> the query's column projection down to the table scan operators so that only 
> the needed columns are read. It does this by adding the needed column IDs 
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the case above, the query unions the results of two subqueries that 
> select from the same table. The first subquery doesn't need any column from 
> the Parquet file, while the second subquery needs the column "col1". Hive 
> has a bug here: it ends up setting "hive.io.file.readcolumn.ids" to a value 
> like "0,,0", which the method ColumnProjectionUtils.getReadColumnIDs cannot 
> parse.
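
To make the failure mode concrete, here is a minimal standalone sketch of why a comma-separated ID list such as "0,,0" trips up a strict integer parser. It is not Hive's ColumnProjectionUtils code: the class ReadColumnIdsSketch and the methods parseStrict and parseLenient are hypothetical names introduced only for illustration, and the assumption is that the parser splits on commas and converts each token with Integer.parseInt.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only; this is NOT Hive's ColumnProjectionUtils.
final class ReadColumnIdsSketch {

    // Strict parse: every comma-separated token must be a valid integer.
    static List<Integer> parseStrict(String ids) {
        List<Integer> result = new ArrayList<>();
        if (ids.isEmpty()) {
            return result; // no columns requested
        }
        for (String token : ids.split(",")) {
            // An empty token (from ",,") makes parseInt throw NumberFormatException.
            result.add(Integer.parseInt(token));
        }
        return result;
    }

    // Lenient parse (hypothetical workaround): skip empty tokens.
    static List<Integer> parseLenient(String ids) {
        List<Integer> result = new ArrayList<>();
        for (String token : ids.split(",")) {
            String trimmed = token.trim();
            if (trimmed.isEmpty()) {
                continue; // contribution from a subquery that needs no columns
            }
            result.add(Integer.parseInt(trimmed));
        }
        return result;
    }

    public static void main(String[] args) {
        try {
            parseStrict("0,,0");
        } catch (NumberFormatException e) {
            System.out.println("strict parse failed: " + e.getMessage());
        }
        System.out.println(parseLenient("0,,0")); // prints [0, 0]
    }
}
```

The empty middle token is what a subquery needing no columns would contribute if its (empty) ID list were appended with a separator. Whether the proper fix is to avoid writing the empty entry or to tolerate it when reading is a design choice for the patch; this sketch only shows the reading side.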



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
