[jira] [Commented] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files

Hive QA (JIRA) Fri, 23 Jun 2017 21:56:49 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061801#comment-16061801
 ]


Hive QA commented on HIVE-16177:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874345/HIVE-16177.16.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10849 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_buckets] (batchId=241)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[zero_rows_blobstore] (batchId=241)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5762/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5762/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5762/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874345 - PreCommit-HIVE-Build

> non Acid to acid conversion doesn't handle _copy_N files
> --------------------------------------------------------
>
>                 Key: HIVE-16177
>                 URL: https://issues.apache.org/jira/browse/HIVE-16177
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Blocker
>         Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch, 
> HIVE-16177.04.patch, HIVE-16177.07.patch, HIVE-16177.08.patch, 
> HIVE-16177.09.patch, HIVE-16177.10.patch, HIVE-16177.11.patch, 
> HIVE-16177.14.patch, HIVE-16177.15.patch, HIVE-16177.16.patch
>
>
> {noformat}
> create table T(a int, b int) clustered by (a)  into 2 buckets stored as orc 
> TBLPROPERTIES('transactional'='false')
> insert into T(a,b) values(1,2)
> insert into T(a,b) values(1,3)
> alter table T SET TBLPROPERTIES ('transactional'='true')
> {noformat}
> We should now have bucket files 000001_0 and 000001_0_copy_1, but 
> OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that copy_N files 
> can exist and restarts row numbering at 0 for each file, so rows in the same 
> bucket get duplicate IDs.
> {noformat}
> select ROW__ID, INPUT__FILE__NAME, a, b from T
> {noformat}
> produces 
> {noformat}
> {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/000001_0,1,2
> {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/000001_0_copy_1,1,3
> {noformat}
> [~owen.omalley], do you have any thoughts on a good way to handle this?
> The attached patch has a few changes to make Acid even recognize copy_N 
> files, but this is just a prerequisite.  The new UT demonstrates the issue.
> Furthermore,
> {noformat}
> alter table T compact 'major'
> select ROW__ID, INPUT__FILE__NAME, a, b from T order by b
> {noformat}
> produces 
> {noformat}
> {"transactionid":0,"bucketid":1,"rowid":0}    file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands....warehouse/nonacidorctbl/base_-9223372036854775808/bucket_00001    1       2
> {noformat}
> HIVE-16177.04.patch has TestTxnCommands.testNonAcidToAcidConversion0(), 
> which demonstrates this.
> This is because the compactor doesn't handle copy_N files either (it skips 
> them).
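
The duplicate ROW__ID values described in the quoted issue can be illustrated with a small sketch. It is not the code from any of the attached patches and not Hive's implementation; it only shows one conceivable way to keep row IDs unique, by ordering the files of a bucket by copy number and starting each file's numbering where the previous file stopped. The helper names (parse_original_file, row_id_offsets), the regular expression, and the row_counts input are all hypothetical, and the file-name pattern is assumed from the two names shown in the issue (000001_0 and 000001_0_copy_1).

```python
import re
from collections import defaultdict

# Assumed pattern, derived only from the two file names shown in the issue:
# 000001_0 and 000001_0_copy_1.
ORIGINAL_FILE = re.compile(r"^(\d+)_(\d+)(?:_copy_(\d+))?$")


def parse_original_file(name):
    """Return (bucket, copy_number) for an original bucket file name, or None.

    A name without a _copy_N suffix is treated as copy number 0.
    """
    match = ORIGINAL_FILE.match(name)
    if match is None:
        return None
    bucket = int(match.group(1))
    copy_number = int(match.group(3)) if match.group(3) else 0
    return bucket, copy_number


def row_id_offsets(row_counts):
    """Map each file name to the first rowid it would use.

    row_counts is a hypothetical input: file name -> number of rows in it.
    Files of one bucket are ordered by copy number, and each file continues
    numbering where the previous file of that bucket stopped.
    """
    by_bucket = defaultdict(list)
    for name in row_counts:
        parsed = parse_original_file(name)
        if parsed is None:
            continue  # not an original bucket file; ignored in this sketch
        bucket, copy_number = parsed
        by_bucket[bucket].append((copy_number, name))

    offsets = {}
    for files in by_bucket.values():
        next_row_id = 0
        for _, name in sorted(files):
            offsets[name] = next_row_id
            next_row_id += row_counts[name]
    return offsets
```

For the table T above, with one row in each file, row_id_offsets({"000001_0": 1, "000001_0_copy_1": 1}) assigns offset 0 to 000001_0 and offset 1 to 000001_0_copy_1, so the two rows would no longer share rowid 0. A compactor that used the same matcher would also see the copy_N file instead of skipping it.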



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
