Hi Igor,

Have you started the external shuffle service manually?
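
If it helps, here is a rough way to check which agents have nothing listening on the shuffle service port. It is only a sketch under some assumptions: the host names in the usage comment are hypothetical placeholders, and 7337 is simply the port from your stack trace. A "Connection refused" like yours usually means no process is accepting connections on that port. On Mesos, as far as I know, the service is normally started on each agent with $SPARK_HOME/sbin/start-mesos-shuffle-service.sh.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


def find_unreachable(hosts, port=7337, timeout=3.0):
    """Return the hosts where nothing accepts connections on the given port."""
    return [h for h in hosts if not is_port_open(h, port, timeout)]


# Usage (hypothetical agent names, replace with your own Mesos agents):
#   find_unreachable(["agent-1.example.com", "agent-2.example.com"])
```

Any host returned by find_unreachable is a candidate for the node the driver failed to register with. The check only proves that something is listening on the port, not that it is the Spark shuffle service.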
Cheers

2018-04-12 10:48 GMT+02:00 igor.berman <igor.ber...@gmail.com>:

> Hi,
> any input on whether the following is expected?
> The driver starts and is unable to connect to the external shuffle service
> on one of the nodes (no matter what the reason is).
> This makes the framework go to Inactive mode in the Mesos UI.
> However, it seems that the driver doesn't exit and continues to execute
> tasks (or tries to). The stack trace below shows a few lines around the
> connection error and the aborting message.
>
> The question is: is this expected behaviour?
>
> Here is the stack trace:
>
> I0412 07:31:25.827283 274 sched.cpp:759] Framework registered with 15d9838f-b266-413b-842d-f7c3567bd04a-0051
> Exception in thread "Thread-295" java.io.IOException: Failed to connect to my-company.com/x.x.x.x:7337
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
>     at org.apache.spark.network.shuffle.mesos.MesosExternalShuffleClient.registerDriverWithShuffleService(MesosExternalShuffleClient.java:75)
>     at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:537)
> Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused:my-company.com/x.x.x.x:7337
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
>     at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
>     at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>     at java.lang.Thread.run(Thread.java:748)
> I0412 07:35:12.032925 277 sched.cpp:2055] Asked to abort the driver
> I0412 07:35:12.033035 277 sched.cpp:1233] Aborting framework 15d9838f-b266-413b-842d-f7c3567bd04a-0051