[ https://issues.apache.org/jira/browse/HIVE-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148175#comment-15148175 ]
JoneZhang commented on HIVE-12649:
----------------------------------

The applications submitted to estimate the reducer number fail fast once resources become free after a while.
The error log looks like this:

Container: container_1448873753366_121022_02_000001 on 10.239.243.69_8041
===========================================================================
LogType: stderr
LogLength: 4284
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1448873753366_121022_000002
15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mqq); users with modify permissions: Set(mqq)
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to: 10.179.12.140:38842
15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to default.
15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
    at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
    at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
    at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
    at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:745)
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 150000 ms. Please check earlier log output for errors. Failing the application.
15/12/09 16:29:43 INFO util.Utils: Shutdown hook called

> Hive on Spark will resubmitted application when not enough resouces to launch
> yarn application master
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12649
>                 URL: https://issues.apache.org/jira/browse/HIVE-12649
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 1.2.1
>            Reporter: JoneZhang
>            Assignee: Xuefu Zhang
>
> Hive on Spark estimates the number of reducers when the query does not set
> it explicitly, which causes an application to be submitted. The application
> stays pending if the YARN queue's resources are insufficient.
> So there will probably be more than one pending application, because
> there is more than one estimate call. The failure is soft, so it doesn't
> prevent subsequent processing. We can make that a hard failure.
> That code is found in
>     at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:112)
>     at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:115)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)