[jira] [Commented] (HIVE-14053) Hive should report that primary keys can't be null.

Hive QA (JIRA) Mon, 19 Dec 2016 19:48:30 -0800

    [ https://issues.apache.org/jira/browse/HIVE-14053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763118#comment-15763118 ]


Hive QA commented on HIVE-14053:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843985/HIVE-14053.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10823 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2644/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2644/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2644/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843985 - PreCommit-HIVE-Build

> Hive should report that primary keys can't be null.
> ---------------------------------------------------
>
>                 Key: HIVE-14053
>                 URL: https://issues.apache.org/jira/browse/HIVE-14053
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Carter Shanklin
>            Assignee: Pengcheng Xiong
>            Priority: Minor
>         Attachments: HIVE-14053.01.patch, HIVE-14053.02.patch
>
>
> HIVE-13076 introduces "rely novalidate" primary and foreign keys to Hive. 
> With the right driver in place, tools like Tableau can perform join 
> elimination, and queries can run much faster.
> Some gaps remain: currently, getAttributes() in HiveDatabaseMetaData doesn't 
> work quite right for keys. In particular, primary keys are by definition not 
> null, and the metadata should reflect this to improve join elimination.
> In this example, which uses the TPC-H schema and its constraints, we sum 
> l_extendedprice and group by l_shipmode. This query should not need to 
> access anything beyond the lineitem table.
> With all the constraints in place, Tableau generates this query:
> {code}
> SELECT `lineitem`.`l_shipmode` AS `l_shipmode`,
>   SUM(`lineitem`.`l_extendedprice`) AS `sum_l_extendedprice_ok`
> FROM `tpch_bin_flat_orc_2`.`lineitem` `lineitem`
>   JOIN `tpch_bin_flat_orc_2`.`orders` `orders` ON (`lineitem`.`l_orderkey` = `orders`.`o_orderkey`)
>   JOIN `tpch_bin_flat_orc_2`.`customer` `customer` ON (`orders`.`o_custkey` = `customer`.`c_custkey`)
>   JOIN `tpch_bin_flat_orc_2`.`nation` `nation` ON (`customer`.`c_nationkey` = `nation`.`n_nationkey`)
> WHERE ((((NOT (`lineitem`.`l_partkey` IS NULL)) AND (NOT (`lineitem`.`l_suppkey` IS NULL))) AND ((NOT (`lineitem`.`l_partkey` IS NULL)) AND (NOT (`lineitem`.`l_suppkey` IS NULL)))) AND (NOT (`nation`.`n_regionkey` IS NULL)))
> {code}
> Since these are primary keys, the denormalizing joins and the WHERE 
> condition are unnecessary, and this sort of query can run a lot faster by 
> accessing just the lineitem table.
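
The following is a minimal sketch, not taken from the issue or from any patch, of why key nullability matters for join elimination. It assumes Python's built-in sqlite3 module as a stand-in for Hive, uses heavily simplified versions of the lineitem and orders tables, and the names SCHEMA, WITH_JOIN, WITHOUT_JOIN and compare are hypothetical helpers introduced only for illustration.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the TPC-H tables named in the issue.
SCHEMA = """
CREATE TABLE orders (o_orderkey INTEGER PRIMARY KEY NOT NULL);
CREATE TABLE lineitem (
    l_orderkey      INTEGER,  -- deliberately left nullable for the demo
    l_shipmode      TEXT,
    l_extendedprice INTEGER
);
INSERT INTO orders VALUES (1), (2);
INSERT INTO lineitem VALUES (1, 'AIR', 10), (2, 'AIR', 5), (2, 'RAIL', 3);
"""

# Aggregation written with the join a BI tool might generate.
WITH_JOIN = """
SELECT l_shipmode, SUM(l_extendedprice)
FROM lineitem JOIN orders ON l_orderkey = o_orderkey
GROUP BY l_shipmode ORDER BY l_shipmode
"""

# The same aggregation after join elimination.
WITHOUT_JOIN = """
SELECT l_shipmode, SUM(l_extendedprice)
FROM lineitem
GROUP BY l_shipmode ORDER BY l_shipmode
"""


def compare(include_null_key):
    """Return (joined_result, single_table_result) for the two queries."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        if include_null_key:
            # A row whose join key is NULL never matches in an inner join.
            conn.execute("INSERT INTO lineitem VALUES (NULL, 'AIR', 7)")
        joined = conn.execute(WITH_JOIN).fetchall()
        single = conn.execute(WITHOUT_JOIN).fetchall()
        return joined, single
    finally:
        conn.close()
```

When every join key is non-null and matches a primary key, compare(False) returns the same rows for both queries, so dropping the join is safe. With compare(True), the inner join silently discards the row with the NULL key and the two results differ, which is why a tool that cannot confirm non-nullability from the metadata has to keep the joins and the IS NULL filters.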



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
