Saket Saurabh created HIVE-14366:
------------------------------------
Summary: Conversion of a Non-ACID table to an ACID table produces
non-unique primary keys
Key: HIVE-14366
URL: https://issues.apache.org/jira/browse/HIVE-14366
Project: Hive
Issue Type: Bug
Components: Transactions
Reporter: Saket Saurabh
When a Non-ACID table is converted to an ACID table, the primary key consisting
of (original transaction id, bucket_id, row_id) is not generated uniquely.
Currently, the row_id is set to 0 for most rows, so multiple rows end up sharing
the same key. This leads to correctness issues for such tables.
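To illustrate why a non-unique key is a correctness problem, here is a minimal, self-contained sketch. It is not Hive code: `RowKeyDemo`, `RowKey`, `assignKeys` and `survivorsAfterDelete` are hypothetical names, and the sketch assumes (for simplicity) that every pre-conversion row gets original transaction id 0, bucket 0, and, in the buggy case, row_id 0. A delete that identifies its target by key then matches every row that shares that key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class RowKeyDemo {
    /** Hypothetical stand-in for the (original txn id, bucket_id, row_id) key. */
    public static final class RowKey {
        final long originalTxnId;
        final int bucketId;
        final long rowId;

        RowKey(long originalTxnId, int bucketId, long rowId) {
            this.originalTxnId = originalTxnId;
            this.bucketId = bucketId;
            this.rowId = rowId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RowKey)) {
                return false;
            }
            RowKey other = (RowKey) o;
            return originalTxnId == other.originalTxnId
                && bucketId == other.bucketId
                && rowId == other.rowId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(originalTxnId, bucketId, rowId);
        }
    }

    /** Assigns keys to n rows; with uniqueRowIds == false every row gets row_id 0. */
    public static List<RowKey> assignKeys(int n, boolean uniqueRowIds) {
        List<RowKey> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // Assumption: txn id 0 and a single bucket 0 for all original rows.
            keys.add(new RowKey(0L, 0, uniqueRowIds ? i : 0L));
        }
        return keys;
    }

    /** Deletes the row at index victim by key and counts the rows that survive. */
    public static int survivorsAfterDelete(List<RowKey> keys, int victim) {
        Set<RowKey> deleted = new HashSet<>();
        deleted.add(keys.get(victim));
        int survivors = 0;
        for (RowKey key : keys) {
            if (!deleted.contains(key)) {
                survivors++;
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        // Unique row ids: only the targeted row is removed.
        System.out.println(survivorsAfterDelete(assignKeys(5, true), 0));  // prints 4
        // Row id 0 everywhere: the delete matches all five rows.
        System.out.println(survivorsAfterDelete(assignKeys(5, false), 0)); // prints 0
    }
}
```

In this simplified model, deleting one of five rows leaves four rows when row ids are unique, but leaves none when all rows share row_id 0.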
The quickest way to reproduce this is to add the following unit test to
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
{code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
  @Test
  public void testOriginalReader() throws Exception {
    FileSystem fs = FileSystem.get(hiveConf);
    FileStatus[] status;

    // 1. Insert five rows into the Non-ACID table.
    runStatementOnDriver("insert into " + Table.NONACIDORCTBL
        + "(a,b) values(1,2),(3,4),(5,6),(7,8),(9,10)");

    // 2. Convert NONACIDORCTBL to an ACID table.
    runStatementOnDriver("alter table " + Table.NONACIDORCTBL
        + " SET TBLPROPERTIES ('transactional'='true')");

    // 3. Perform a major compaction.
    runStatementOnDriver("alter table " + Table.NONACIDORCTBL + " compact 'MAJOR'");
    runWorker(hiveConf);

    // 4. Perform a delete.
    runStatementOnDriver("delete from " + Table.NONACIDORCTBL + " where a = 1");

    // The projection should now return (3,4),(5,6),(7,8),(9,10) only,
    // since (1,2) has been deleted.
    List<String> rs = runStatementOnDriver("select a,b from "
        + Table.NONACIDORCTBL + " order by a,b");
    int[][] resultData = new int[][] {{3,4}, {5,6}, {7,8}, {9,10}};
    Assert.assertEquals(stringifyValues(resultData), rs);
  }
{code}