[ 
https://issues.apache.org/jira/browse/HIVE-23048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062614#comment-17062614
 ] 

Peter Vary edited comment on HIVE-23048 at 3/19/20, 2:19 PM:
-------------------------------------------------------------

We need the following functionality from the backend database:
 * Way to generate TXN_ID - sequence, or identity like stuff
 * Way to insert single row to TXN data, and retrieve back the generated id
 * Way to batch insert row to TXN data, and retrieve back the generated id-s
 * Way to get the last transaction id for HWM calculations

Created a small test class to check the different databases, used the following 
versions:
 * Derby - 10.14.1.0
 * MySQL - 5.7.23
 * PostgreSQL - 11.5
 * Microsoft SQL Server - 2017 GA
 * Oracle - XE 11g - had to use ojdbc8.jar driver (downloaded for the 19c 
version) to make getGeneratedKeys work for batch

Here are the results:
{code:java}
--------
Checking Apache Derby
Single statement
Gen key: 1
Multi statement
Gen keys: [11]
--------
Checking MySQL
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
--------
Checking PostgreSQL
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Seq state: 11
--------
Checking Microsoft SQL Server
Single statement
Gen key: 1
Multi statement
com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed 
before any results can be obtained.
        at 
com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234)
        at 
com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:2216)
        at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.getGeneratedKeys(HikariProxyPreparedStatement.java)
        at org.apache.hadoop.hive.a.runCheck(a.java:130)
        at org.apache.hadoop.hive.a.main(a.java:171)
Seq state: 11
--------
Checking Oracle
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Seq state: 11
{code}
The key takeaways:
 * MSSQL and Derby does not support batch insert with getGeneratedKeys. 
Workaround could be:
 ** Insert them row by row
 * MySQL, MSSQL, Derby does not support retrieving the last transaction id. 
Workaround could be:
 ** Run specific query, like:
{code:java}
SELECT MAX(TXN_ID) FROM (SELECT MAX(TXN_ID) FROM TXNS UNION ALL SELECT 
MAX(CTC_TXNID) FROM COMPLETED_TXN_COMPONENTS)
{code}


was (Author: pvary):
We need the following functionality from the backend database:
 * Way to generate TXN_ID - sequence, or identity like stuff
 * Way to batch insert row to TXN data, and retrieve back the generated id-s
 * Way to get the last transaction id for HWM calculations

Created a small test class to check the different databases, used the following 
versions:
 * Derby - 10.14.1.0
 * MySQL - 5.7.23
 * PostgreSQL - 11.5
 * Microsoft SQL Server - 2017 GA
 * Oracle - XE 11g - had to use ojdbc8.jar driver (downloaded for the 19c 
version) to make getGeneratedKeys work for batch

Here are the results:
{code:java}
--------
Checking Apache Derby
Single statement
Gen key: 1
Multi statement
Gen keys: [11]
--------
Checking MySQL
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
--------
Checking PostgreSQL
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Seq state: 11
--------
Checking Microsoft SQL Server
Single statement
Gen key: 1
Multi statement
com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed 
before any results can be obtained.
        at 
com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234)
        at 
com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:2216)
        at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.getGeneratedKeys(HikariProxyPreparedStatement.java)
        at org.apache.hadoop.hive.a.runCheck(a.java:130)
        at org.apache.hadoop.hive.a.main(a.java:171)
Seq state: 11
--------
Checking Oracle
Single statement
Gen key: 1
Multi statement
Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Seq state: 11
{code}
The key takeaways:
 * MSSQL and Derby does not support batch insert with getGeneratedKeys. 
Workaround could be:
 ** Insert them row by row
 * MySQL, MSSQL, Derby does not support retrieving the last transaction id. 
Workaround could be:
 ** Run specific query, like:
{code:java}
SELECT MAX(TXN_ID) FROM (SELECT MAX(TXN_ID) FROM TXNS UNION ALL SELECT 
MAX(CTC_TXNID) FROM COMPLETED_TXN_COMPONENTS)
{code}

> Use sequences for TXN_ID generation
> -----------------------------------
>
>                 Key: HIVE-23048
>                 URL: https://issues.apache.org/jira/browse/HIVE-23048
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: DbTest.java
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to