[ https://issues.apache.org/jira/browse/HIVE-23048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062614#comment-17062614 ]
Peter Vary edited comment on HIVE-23048 at 3/19/20, 2:19 PM: ------------------------------------------------------------- We need the following functionality from the backend database: * Way to generate TXN_ID - sequence, or identity like stuff * Way to insert single row to TXN data, and retrieve back the generated id * Way to batch insert row to TXN data, and retrieve back the generated id-s * Way to get the last transaction id for HWM calculations Created a small test class to check the different databases, used the following versions: * Derby - 10.14.1.0 * MySQL - 5.7.23 * PostgreSQL - 11.5 * Microsoft SQL Server - 2017 GA * Oracle - XE 11g - had to use ojdbc8.jar driver (downloaded for the 19c version) to make getGeneratedKeys work for batch Here are the results: {code:java} -------- Checking Apache Derby Single statement Gen key: 1 Multi statement Gen keys: [11] -------- Checking MySQL Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] -------- Checking PostgreSQL Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] Seq state: 11 -------- Checking Microsoft SQL Server Single statement Gen key: 1 Multi statement com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed before any results can be obtained. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234) at com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:2216) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.getGeneratedKeys(HikariProxyPreparedStatement.java) at org.apache.hadoop.hive.a.runCheck(a.java:130) at org.apache.hadoop.hive.a.main(a.java:171) Seq state: 11 -------- Checking Oracle Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] Seq state: 11 {code} The key takeaways: * MSSQL and Derby does not support batch insert with getGeneratedKeys. Workaround could be: ** Insert them row by row * MySQL, MSSQL, Derby does not support retrieving the last transaction id. Workaround could be: ** Run specific query, like: {code:java} SELECT MAX(TXN_ID) FROM (SELECT MAX(TXN_ID) FROM TXNS UNION ALL SELECT MAX(CTC_TXNID) FROM COMPLETED_TXN_COMPONENTS) {code} was (Author: pvary): We need the following functionality from the backend database: * Way to generate TXN_ID - sequence, or identity like stuff * Way to batch insert row to TXN data, and retrieve back the generated id-s * Way to get the last transaction id for HWM calculations Created a small test class to check the different databases, used the following versions: * Derby - 10.14.1.0 * MySQL - 5.7.23 * PostgreSQL - 11.5 * Microsoft SQL Server - 2017 GA * Oracle - XE 11g - had to use ojdbc8.jar driver (downloaded for the 19c version) to make getGeneratedKeys work for batch Here are the results: {code:java} -------- Checking Apache Derby Single statement Gen key: 1 Multi statement Gen keys: [11] -------- Checking MySQL Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] -------- Checking PostgreSQL Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] Seq state: 11 -------- Checking Microsoft SQL Server Single statement Gen key: 1 Multi statement com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed before any results can be obtained. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234) at com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:2216) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.getGeneratedKeys(HikariProxyPreparedStatement.java) at org.apache.hadoop.hive.a.runCheck(a.java:130) at org.apache.hadoop.hive.a.main(a.java:171) Seq state: 11 -------- Checking Oracle Single statement Gen key: 1 Multi statement Gen keys: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11] Seq state: 11 {code} The key takeaways: * MSSQL and Derby does not support batch insert with getGeneratedKeys. Workaround could be: ** Insert them row by row * MySQL, MSSQL, Derby does not support retrieving the last transaction id. Workaround could be: ** Run specific query, like: {code:java} SELECT MAX(TXN_ID) FROM (SELECT MAX(TXN_ID) FROM TXNS UNION ALL SELECT MAX(CTC_TXNID) FROM COMPLETED_TXN_COMPONENTS) {code} > Use sequences for TXN_ID generation > ----------------------------------- > > Key: HIVE-23048 > URL: https://issues.apache.org/jira/browse/HIVE-23048 > Project: Hive > Issue Type: Bug > Reporter: Peter Vary > Assignee: Peter Vary > Priority: Major > Attachments: DbTest.java > > -- This message was sent by Atlassian Jira (v8.3.4#803005)