[ https://issues.apache.org/jira/browse/FLINK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17336030#comment-17336030 ]
Flink Jira Bot commented on FLINK-20155:
----------------------------------------

This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> java.lang.OutOfMemoryError: Direct buffer memory
> ------------------------------------------------
>
>                 Key: FLINK-20155
>                 URL: https://issues.apache.org/jira/browse/FLINK-20155
>             Project: Flink
>          Issue Type: Bug
>          Components: Client / Job Submission
>    Affects Versions: 1.11.1
>            Reporter: roee hershko
>            Priority: Major
>              Labels: stale-major
>             Fix For: 1.11.1
>
>         Attachments: image-2020-11-13-17-52-54-217.png
>
>
> Update:
> This issue occurs every time a job fails. The only way to fix it is to
> manually re-create the task manager pods (I am using the Flink operator).
>
> After a job is submitted, it runs for a few hours and then the job manager
> crashes. When I try to re-create the job, I get the following error:
>
> {code:java}
> 2020-11-13 17:44:58
> org.apache.pulsar.client.admin.PulsarAdminException: org.apache.pulsar.shade.io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:228)
>     at org.apache.pulsar.client.admin.internal.TopicsImpl$7.failed(TopicsImpl.java:324)
>     at org.apache.pulsar.shade.org.glassfish.jersey.client.JerseyInvocation$4.failed(JerseyInvocation.java:1030)
>     at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:231)
>     at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime.access$100(ClientRuntime.java:85)
>     at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime$2.lambda$failure$1(ClientRuntime.java:183)
>     at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
>     at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
>     at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:316)
>     at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:298)
>     at org.apache.pulsar.shade.org.glassfish.jersey.internal.Errors.process(Errors.java:268)
>     at org.apache.pulsar.shade.org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:312)
>     at org.apache.pulsar.shade.org.glassfish.jersey.client.ClientRuntime$2.failure(ClientRuntime.java:183)
>     at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$3.onThrowable(AsyncHttpConnector.java:279)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:277)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.WriteListener.abortOnThrowable(WriteListener.java:50)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.WriteListener.operationComplete(WriteListener.java:61)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.WriteCompleteListener.operationComplete(WriteCompleteListener.java:28)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.WriteCompleteListener.operationComplete(WriteCompleteListener.java:20)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:183)
>     at org.apache.pulsar.shade.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:95)
>     at org.apache.pulsar.shade.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:30)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.NettyRequestSender.writeRequest(NettyRequestSender.java:421)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.channel.NettyConnectListener.writeRequest(NettyConnectListener.java:80)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.channel.NettyConnectListener.onSuccess(NettyConnectListener.java:156)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.channel.NettyChannelConnector$1.onSuccess(NettyChannelConnector.java:92)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.SimpleChannelFutureListener.operationComplete(SimpleChannelFutureListener.java:26)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.SimpleChannelFutureListener.operationComplete(SimpleChannelFutureListener.java:20)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
>     at org.apache.pulsar.shade.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
>     at org.apache.pulsar.shade.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:300)
>     at org.apache.pulsar.shade.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335)
>     at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
>     at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>     at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
>     at org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>     at org.apache.pulsar.shade.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.pulsar.shade.io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.pulsar.shade.io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
>     at org.apache.pulsar.shade.io.netty.channel.CombinedChannelDuplexHandler.write(CombinedChannelDuplexHandler.java:346)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:709)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:792)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:702)
>     at org.apache.pulsar.shade.io.netty.handler.stream.ChunkedWriteHandler.doFlush(ChunkedWriteHandler.java:300)
>     at org.apache.pulsar.shade.io.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:132)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:750)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:765)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:790)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:758)
>     at org.apache.pulsar.shade.io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1020)
>     at org.apache.pulsar.shade.io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:299)
>     at org.apache.pulsar.shade.org.asynchttpclient.netty.request.NettyRequestSender.writeRequest(NettyRequestSender.java:420)
>     ... 23 more
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>     at java.base/java.nio.Bits.reserveMemory(Unknown Source)
>     at java.base/java.nio.DirectByteBuffer.<init>(Unknown Source)
>     at java.base/java.nio.ByteBuffer.allocateDirect(Unknown Source)
>     at org.apache.pulsar.shade.io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758)
>     at org.apache.pulsar.shade.io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:734)
>     at org.apache.pulsar.shade.io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:245)
>     at org.apache.pulsar.shade.io.netty.buffer.PoolArena.allocate(PoolArena.java:215)
>     at org.apache.pulsar.shade.io.netty.buffer.PoolArena.allocate(PoolArena.java:147)
>     at org.apache.pulsar.shade.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:356)
>     at org.apache.pulsar.shade.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
>     at org.apache.pulsar.shade.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
>     at org.apache.pulsar.shade.io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:115)
>     at org.apache.pulsar.shade.io.netty.handler.codec.http.HttpObjectEncoder.encode(HttpObjectEncoder.java:93)
>     at org.apache.pulsar.shade.io.netty.handler.codec.http.HttpClientCodec$Encoder.encode(HttpClientCodec.java:167)
>     at org.apache.pulsar.shade.io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89)
>     ... 37 more
> {code}
> The only way to fix it is to restart all the task managers.
> I also noticed that even though I configured 10 GB of memory, my Flink managed
> memory is much smaller:
> !image-2020-11-13-17-52-54-217.png!
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)