[ https://issues.apache.org/jira/browse/HIVE-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084511#comment-17084511 ]
Hive QA commented on HIVE-19587:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12924738/HIVE-19587.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21687/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21687/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21687/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2020-04-16 03:19:38.012
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-21687/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2020-04-16 03:19:38.049
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1745e51 HIVE-23202 Remove unused q.out files, where the q file no longer exists (Miklos Gergely, reviewed by Laszlo Bodor)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1745e51 HIVE-23202 Remove unused q.out files, where the q file no longer exists (Miklos Gergely, reviewed by Laszlo Bodor)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2020-04-16 03:19:47.116
+ rm -rf ../yetus_PreCommit-HIVE-Build-21687
+ mkdir ../yetus_PreCommit-HIVE-Build-21687
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-21687
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-21687/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Trying to apply the patch with -p0
error: a/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java: does not exist in index
Trying to apply the patch with -p1
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:158
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:158
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with conflicts.
U ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-21687
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12924738 - PreCommit-HIVE-Build

> HeartBeat thread uses cancelled delegation token while connecting to meta on KERBEROS cluster
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19587
>                 URL: https://issues.apache.org/jira/browse/HIVE-19587
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0, 2.3.2, 3.1.0
>            Reporter: Oleksiy Sayankin
>            Assignee: Oleksiy Sayankin
>            Priority: Blocker
>         Attachments: HIVE-19587.1.patch
>
> *STEP 1. Create test data*
> {code}
> create table t1 (id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> create table t2 (id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> {code}
> Generate 10 000 000 lines of random data:
> {code}
> package com.test.app;
>
> import java.io.FileNotFoundException;
> import java.io.PrintWriter;
> import java.util.concurrent.ThreadLocalRandom;
>
> public class App {
>   public static void main(String[] args) throws FileNotFoundException {
>     try (PrintWriter out = new PrintWriter("table.data")) {
>       int min = 0;
>       int max = 10_000;
>       int numRows = 10_000_000;
>       for (int i = 0; i < numRows; i++) {
>         int randomNum = ThreadLocalRandom.current().nextInt(min, max + 1);
>         out.println(randomNum);
>       }
>     }
>   }
> }
> {code}
> Upload the data to the Hive tables:
> {code}
> load data local inpath '/home/myuser/table.data' into table t1;
> load data local inpath '/home/myuser/table.data' into table t2;
> {code}
> *STEP 2. Configure transactions in hive-site.xml*
> {code}
> <!-- Transactions -->
> <property>
>   <name>hive.exec.dynamic.partition.mode</name>
>   <value>nonstrict</value>
> </property>
> <property>
>   <name>hive.support.concurrency</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.enforce.bucketing</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.txn.manager</name>
>   <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
> </property>
> <property>
>   <name>hive.compactor.initiator.on</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.compactor.worker.threads</name>
>   <value>1</value>
> </property>
> {code}
> *STEP 3. Configure hive.txn.timeout in hive-site.xml*
> {code}
> <!-- Delegation token issue -->
> <property>
>   <name>hive.txn.timeout</name>
>   <value>10s</value>
> </property>
> {code}
> *STEP 4. Connect via beeline to HS2 with KERBEROS*
> {code}
> !connect jdbc:hive2://node8.cluster:10000/default;principal=myuser/node8.cluster@NODE8;ssl=true;sslTrustStore=/opt/myuser/conf/ssl_truststore
> {code}
> {code}
> select count(*) from t1;
> {code}
> *STEP 5. Close the connection and reconnect*
> {code}
> !close
> {code}
> {code}
> !connect jdbc:hive2://node8.cluster:10000/default;principal=myuser/node8.cluster@NODE8;ssl=true;sslTrustStore=/opt/myuser/conf/ssl_truststore
> {code}
> *STEP 6. Perform a long-running query*
> This query lasts about 600s:
> {code}
> select count(*) from t1 join t2 on t1.id = t2.id;
> {code}
> *EXPECTED RESULT*
> The query finishes successfully.
> *ACTUAL RESULT*
> {code}
> 2018-05-17T13:54:54,921 ERROR [pool-7-thread-10] transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
>     at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:598)
>     at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244)
>     at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:663)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:660)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1613)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:660)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=myuser, renewer=myuser, realUser=, issueDate=1526565229297, maxDate=1527170029297, sequenceNumber=1, masterKeyId=1
>     at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:104)
>     at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:472)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:503)
>     at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589)
>     ... 15 more
> 2018-05-17T13:54:54,966 ERROR [pool-7-thread-10] server.TThreadPoolServer: Error occurred during processing of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:663)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:660)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1613)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:660)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
>     at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>     ... 10 more
> {code}
> *ROOT-CAUSE*
> HS2 uses a cancelled delegation token to connect to the metastore. From the logs:
> 1. The metastore creates the delegation token (issueDate=1526565229297):
> {code}
> 2018-05-17T13:53:49,297 INFO [pool-7-thread-3] delegation.AbstractDelegationTokenSecretManager: Creating password for identifier: owner=myuser, renewer=myuser, realUser=, issueDate=1526565229297, m$
> 2018-05-17T13:53:50,789 INFO [Thread-18] txn.TxnHandler: 'Cleaner' locked by 'node8.cluster'
> 2018-05-17T13:53:50,793 INFO [Thread-18] txn.TxnHandler: 'Cleaner' unlocked by 'node8.cluster'
> 2018-05-17T13:53:54,840 INFO [pool-7-thread-4] metastore.HiveMetaStore: 3: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
> {code}
> 2. The metastore cancels the delegation token (issueDate=1526565229297):
> {code}
> 2018-05-17T13:54:32,380 INFO [pool-7-thread-4] thrift.TokenStoreDelegationTokenSecretManager: Token cancelation requested for identifier: owner=mapr, renewer=mapr, realUser=, issueDate=1526565229297, maxDate=1527170029297,$
> 2018-05-17T13:54:32,416 INFO [pool-7-thread-4] metastore.HiveMetaStore: 3: Cleaning up thread local RawStore...
> {code}
> 3. HS2 uses the cancelled delegation token and crashes (issueDate=1526565229297); the full stack trace is the same as in *ACTUAL RESULT*:
> {code}
> 2018-05-17T13:54:54,921 ERROR [pool-7-thread-10] transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
>     ...
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=myuser, renewer=myuser, realUser=, issueDate=1526565229297, maxDate=1527170029297, sequenceNumber=1, masterKeyId=1
>     at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:104)
> {code}
> *WORKAROUND*
> From the Hive sources:
> {code}
> heartBeatInterval = HIVE_TXN_TIMEOUT / 2
> {code}
> So, to make sure no heartbeat fires during the job, one needs to set
> {code}
> HIVE_TXN_TIMEOUT = (max_job_duration * 2) * 1.25
> {code}
> Here the 1.25 factor is a safety margin guaranteeing that no heartbeat occurs. In my case max_job_duration = ~600s, hence
> {code}
> HIVE_TXN_TIMEOUT = 600s * 2 * 1.25 = 1500s
> {code}
> When I set {{hive.txn.timeout}} to 1500s, the issue was gone on my local env.
> FYI [~isnogood], [~ashutoshc]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
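
The workaround arithmetic above, and the token dates quoted in the logs, can be checked with a short, self-contained sketch. This is illustrative only: the class name, the `safeTxnTimeoutSec` helper, and the 1.25 safety factor are the reporter's workaround convention, not part of Hive's API.

```java
import java.time.Instant;

public class TxnTimeoutSketch {
    // Hive derives the lock heartbeat interval as HIVE_TXN_TIMEOUT / 2,
    // so no heartbeat fires during a job of maxJobDurationSec as long as
    // the timeout exceeds 2 * maxJobDurationSec; 1.25 is the safety
    // margin the reporter used.
    public static long safeTxnTimeoutSec(long maxJobDurationSec) {
        return (long) (maxJobDurationSec * 2 * 1.25);
    }

    public static void main(String[] args) {
        // The ~600 s join from STEP 6 -> set hive.txn.timeout to 1500s.
        System.out.println(safeTxnTimeoutSec(600)); // 1500

        // Decoding the epoch-millis token dates from the logs:
        // maxDate - issueDate is exactly 7 days, which matches the
        // usual delegation token max lifetime.
        long issueDate = 1_526_565_229_297L;
        long maxDate = 1_527_170_029_297L;
        System.out.println(Instant.ofEpochMilli(issueDate)); // 2018-05-17T13:53:49.297Z
        System.out.println((maxDate - issueDate) / 86_400_000L); // 7
    }
}
```

Note that the decoded issueDate lines up with the 13:53:49,297 metastore log entry in root-cause item 1, which ties the cancelled token directly to the failed SASL negotiation a minute later.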