[Stage 284:============>(199 + 1) / 200][Stage 292:> (1 + 3) / 200] [Stage 284:============>(199 + 1) / 200][Stage 292:> (2 + 3) / 200] [Stage 292:> (2 + 4) / 200][14/06/21 10:46:17,006 WARN shuffle-server-4](TransportChannelHandler) Exception in connection from <hostname> java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:192) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313) at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) at java.lang.Thread.run(Thread.java:748) [14/06/21 10:46:17,010 ERROR shuffle-server-4](TransportResponseHandler) Still have 1 requests outstanding when connection from <hostname> is closed [14/06/21 10:46:17,012 ERROR Spark Context Cleaner](ContextCleaner) Error cleaning broadcast 159 java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:192) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313) at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) at java.lang.Thread.run(Thread.java:748) [14/06/21 10:46:17,012 WARN block-manager-ask-thread-pool-69](BlockManagerMaster) Failed to remove broadcast 159 with removeFromMaster = true - Connection reset by peer java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:192) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313) at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) at java.lang.Thread.run(Thread.java:748) ============================================ On 2021/07/29 12:46:01, Big data developer need help relat to spark gateway roles in 2.0 <harshsharma21...@gmail.com> wrote: > Hi Team , > > We are facing issue in production where we are getting frequent > > Still have 1 request outstanding when connection with the hostname was closed > > connection reset by peer : errors as well as warnings : failed to remove cache > rdd or failed to remove broadcast variable. > > Please help us how to mitigate this : > > Executor memory : 12g > > Network timeout : 600000 > > Heartbeat interval : 250000 > > > > > > > [](https://www.avast.com/sig- > email?utm_medium=email&utm_source=link&utm_campaign=sig- > email&utm_content=emailclient) | Virus-free. > [www.avast.com](https://www.avast.com/sig- > email?utm_medium=email&utm_source=link&utm_campaign=sig- > email&utm_content=emailclient) > ---|--- > > \--------------------------------------------------------------------- To > unsubscribe e-mail: user-unsubscr...@spark.apache.org > >
--------------------------------------------------------------------- To unsubscribe e-mail: user-unsubscr...@spark.apache.org