Yes I came across this back in April trying to load 1.7 million rows from an RDBMS via SAP replication server into Hive
My notes were “Trying to sync a table from ASE--> RS --> to Hadoop via DIRECT LOAD. The source has 1.7 million rows and is populating Hive table. However, I only get around 50K rows in Hive table before MapReduce jobs gives up and get killed. I have turned on concurrency and use an Oracle database as metastore. Data I believe is delivered in bulk through files show as rs_temp__nnn below. I thought that by turning concurrency on in Hive, I would have resolved the problem. 2015-04-16 17:04:34,773 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,784 INFO [pool-3-thread-197]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(713)) - 158: source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab81d6ab0_t 2015-04-16 17:04:34,784 INFO [pool-3-thread-197]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(339)) - ugi=hduser ip=127.0.0.1 cmd=source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab81d6ab0_t 2015-04-16 17:04:34,785 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,798 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,799 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,808 INFO [pool-3-thread-198]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(713)) - 162: source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab804eda0_t 2015-04-16 17:04:34,809 INFO [pool-3-thread-198]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(339)) - ugi=hduser ip=127.0.0.1 cmd=source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab804eda0_t 2015-04-16 17:04:34,810 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,813 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,827 ERROR [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(931)) - Too many repeated deadlocks in unlock, giving up. 2015-04-16 17:04:34,835 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,839 ERROR [pool-3-thread-199]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(141)) - org.apache.thrift.TException: MetaException(message:Unable to update transaction database java.sql.SQLException: ORA-08177: can't serialize access for this transaction Now that ORA-08177 means that the transaction ordering of bulk data from RS . According to the docs, ORA-08177 can be caused only by serializable transactions. It means that a row which the serializable transaction is trying to modify was modified by another transaction after the serializable transaction has begun. “ After a couple of weeks I came up with the following approach “OK guys, Some good news Sounds like setting these two parameters helps! 1. -- -- Parameter "mat_load_tran_size", Default: 10000, specifies the optimal transaction size or batch size for the initial copying of primary data to the replicate table during direct load materialization. alter connection to hiveserver2.asehadoop set mat_load_tran_size to "50000" go -- Parameter "max_mat_load_threads", Default: 5, specifies the maximum number of load threads for each table being materialized. alter connection to hiveserver2.asehadoop set max_mat_load_threads to "1" Makes things work without falling over 2. Need to have concurrency enabled in Hive metastore. Mine is on Oracle. You need to run separate sql against it one labelled like hive-txn-schema-0.14.0.oracle.sql after the basic one hive-schema-0.14.0.oracle.sql 3. Make sure that concurrency is enabled in Hive. hive.support.concurrency <https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.support.concurrency> is false by default 4. Once concurrency in Hive is enabled, you need to install and run Apache zookeeper <https://zookeeper.apache.org/> for distributed lock management otherwise you are going to encounter deadlock or serialisation issues in your metadata as below 2015-04-16 17:04:34,785 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,798 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,799 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,808 INFO [pool-3-thread-198]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(713)) - 162: source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab804eda0_t 2015-04-16 17:04:34,809 INFO [pool-3-thread-198]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(339)) - ugi=hduser ip=127.0.0.1 cmd=source:127.0.0.1 get_table : db=asehadoop tbl=rs_temp__0x2aaab804eda0_t 2015-04-16 17:04:34,810 WARN [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,813 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,827 ERROR [pool-3-thread-199]: txn.TxnHandler (TxnHandler.java:detectDeadlock(931)) - Too many repeated deadlocks in unlock, giving up. 2015-04-16 17:04:34,835 WARN [pool-3-thread-154]: txn.TxnHandler (TxnHandler.java:detectDeadlock(928)) - Deadlock detected in unlock, trying again. 2015-04-16 17:04:34,839 ERROR [pool-3-thread-199]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(141)) - org.apache.thrift.TException: MetaException(message:Unable to update transaction database java.sql.SQLException: ORA-08177: can't serialize access for this transaction 5. I believe the best way is to add a searchable column (PK) to repdef and replicate a sub-range of rows by where clause say ½ million rows at a time OK just right now I managed to get ½ million rows into hive pretty quickly. Source table in ASE 1> select count(1) from t 2> go ----------- 500000 (1 row affected) In Replicate database asehaddp in Hive hive> use asehadoop; OK hive> select count(1) from t; Query ID = hduser_20150429223737_1c0522a4-a7d2-4b71-873a-49e698e4c17a Starting Job = job_1430341149959_0052, Tracking URL = http://rhes564:8088/proxy/application_1430341149959_0052/ Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1 2015-04-29 22:37:47,737 Stage-1 map = 0%, reduce = 0% 2015-04-29 22:37:55,028 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 2.92 sec 2015-04-29 22:37:56,062 Stage-1 map = 60%, reduce = 0%, Cumulative CPU 8.95 sec 2015-04-29 22:38:01,262 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 11.8 sec 2015-04-29 22:38:02,295 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 13.28 sec 2015-04-29 22:38:03,335 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.52 sec OK 500000 I really need to go back and retry these tests Mich Talebzadeh Sybase ASE 15 Gold Medal Award 2008 A Winning Strategy: Running the most Critical Financial Data on ASE 15 http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7. co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4 Publications due shortly: Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8 Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility. From: Steve Howard [mailto:[email protected]] Sent: 24 September 2015 14:53 To: [email protected] Subject: Re: ORA-8177 with Hive transactions All, We continue to struggle with this. We *never* get the lock, and found one issue in which the retry logic gets in an infinite loop. We submitted a JIRA for that (https://issues.apache.org/jira/browse/HIVE-11934), and patched our version (HDP 2.3, Hive 1.2.1) with a fix in which the deadlockCount variable is no longer managed in the lock() method. That works, but we still couldn't get the lock, and the exception was thrown after ten retries. At least we knew it was broken earlier ;) We have made the changes to the HIVE transaction tables to enable ROWDEPENDENCIES, but are still plagued with serialization errors that are never resolved. We have only a single writer, as the Hive database environment is used as a target for an existing EDW dataset. The job loader is the only one making changes. However, it is for analytics, so we have a lot of readers. We have considered changing the TRANSACTION_SERIALIZABLE for the dbConn() method call in the TxnHandler class to READ_COMMITTED, as Oracle provides consistent reads. Of course, the serialization exception is thrown when one thread (a read or a writer I guess) attempts to lock a hive table (or in our case, several hundred daily hive table partitions) attempts to update the row, and another thread has changed and committed it in the meantime. Unless I missing it, this will always be an issue since we have readers and writers and each appears to take a lock. If we know we will have a single writer, the largest risk is that the reader thinks the data hasn't changed, when it has. For our needs, that isn't a huge issue. Are we missing something? Any ideas? Thanks, Steve On Fri, Sep 18, 2015 at 3:39 PM, Steve Howard <[email protected] <mailto:[email protected]> > wrote: I think ROWDEPENDENCIES on an Oracle table also covers this issue, so I don't think a separate JIRA is needed for the INITRANS change. On Fri, Sep 18, 2015 at 2:51 PM, Sergey Shelukhin <[email protected] <mailto:[email protected]> > wrote: There’s HIVE-11831 <https://issues.apache.org/jira/browse/HIVE-11831> and https://issues.apache.org/jira/browse/HIVE-11833 that try to address this. We can do a patch similar to the first one; can you file a JIRA? From: Steve Howard <[email protected] <mailto:[email protected]> > Reply-To: "[email protected] <mailto:[email protected]> " <[email protected] <mailto:[email protected]> > Date: Friday, September 18, 2015 at 10:54 To: "[email protected] <mailto:[email protected]> " <[email protected] <mailto:[email protected]> > Subject: ORA-8177 with Hive transactions While troubleshooting an issue with transactions shortly after enabling them, I noticed the following in an Oracle trace, which is our metastore for hive... ORA-8177: can't serialize access for this transaction These were thrown on "insert into HIVE_LOCKS..." Traditionally in Oracle, if an application actually needs serializable transactions, the fix is to to set initrans and maxtrans to the number of concurrent writers. When I ran what is below on a table similar to HIVE_LOCKS, this exception was thrown everywhere. The fix is to recreate the table with higher values for initrans (only 1 is the default for initrans, and 255 is the default for maxtrans). When I did this and re-ran what is below, the exceptions were no longer thrown. Does anyone have any feedback on this performance hint? The exceptions in hive are thrown from the checkRetryable method in the TxnHandler class, but I couldn't find what class.method throws them. Perhaps the exceptions are not impactful, but given the fact the method expects them as it checks for the string in the exception message, I thought I would ask for feedback before we recreate the HIVE_LOCKS table with a higher value for INITRANS. import java.sql.*;public class testLock implements Runnable { public static void main (String[] args) throws Exception { Class.forName("oracle.jdbc.driver.OracleDriver"); for (int i = 1; i <= 100; i++) { testLock tl = new testLock(); } } public testLock() { Thread t = new Thread(this); t.start(); } public void run() { try { Connection conn = DriverManager.getConnection("jdbc:oracle:thin:username/pwd@dbhost:1521/dbservice"); conn.createStatement().execute("alter session set isolation_level = serializable"); PreparedStatement pst = conn.prepareStatement("update test set a = ?"); for (int j = 1; j <= 10000; j++) { pst.setInt(1,j); pst.execute(); conn.commit(); System.out.println("worked"); } } catch (Exception e) { System.out.println(e.getMessage()); } }}
