zhu created FLINK-33014:
---------------------------
Summary: Flink JobManager raises java.io.IOException: Connection reset by peer
Key: FLINK-33014
URL: https://issues.apache.org/jira/browse/FLINK-33014
Project: Flink
Issue Type: Bug
Affects Versions: 1.17.1
Environment:
|*blob.server.port*|6124|
|*classloader.resolve-order*|parent-first|
|*jobmanager.execution.failover-strategy*|region|
|*jobmanager.memory.heap.size*|2228014280b|
|*jobmanager.memory.jvm-metaspace.size*|536870912b|
|*jobmanager.memory.jvm-overhead.max*|322122552b|
|*jobmanager.memory.jvm-overhead.min*|322122552b|
|*jobmanager.memory.off-heap.size*|134217728b|
|*jobmanager.memory.process.size*|3gb|
|*jobmanager.rpc.address*|naf-flink-ms-flink-manager-1-59m7w|
|*jobmanager.rpc.port*|6123|
|*parallelism.default*|1|
|*query.server.port*|6125|
|*rest.address*|0.0.0.0|
|*rest.bind-address*|0.0.0.0|
|*rest.connection-timeout*|60000|
|*rest.server.numThreads*|8|
|*slot.request.timeout*|3000000|
|*state.backend.rocksdb.localdir*|/home/nafplat/data/flinkStateStore|
|*state.backend.type*|rocksdb|
|*taskmanager.bind-host*|0.0.0.0|
|*taskmanager.host*|0.0.0.0|
|*taskmanager.memory.framework.off-heap.batch-shuffle.size*|256mb|
|*taskmanager.memory.framework.off-heap.size*|512mb|
|*taskmanager.memory.managed.fraction*|0.4|
|*taskmanager.memory.network.fraction*|0.2|
|*taskmanager.memory.process.size*|5gb|
|*taskmanager.memory.task.off-heap.size*|268435456bytes|
|*taskmanager.numberOfTaskSlots*|2|
|*taskmanager.runtime.large-record-handler*|true|
|*web.submit.enable*|true|
|*web.tmpdir*|/tmp/flink-web-c1b57e2b-5426-4fb8-a9ce-5acd1cceefc9|
|*web.upload.dir*|/opt/flink/nafJar|
Reporter: zhu
The Flink cluster was deployed on Kubernetes in standalone mode, using the
Flink 1.17.1 Java 8 Docker image. Since deployment, the JobManager has logged
the warning below at intervals, while the TaskManagers log no errors.
No jobs are currently running.
{code:java}
2023-09-01 11:34:14,293 WARN  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] - Unhandled exception
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_372]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_372]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_372]
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_372]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[?:1.8.0_372]
    at org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:258) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist-1.17.1.jar:1.17.1]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
{code}
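
For context, the trace shows the exception surfacing in a socket read on the REST endpoint's Netty event loop. A server-side read can fail this way when the remote side aborts the TCP connection with a reset (RST) instead of closing it normally. The following is a minimal sketch of that mechanism using plain Java NIO on the loopback interface. It is an illustration only, not a diagnosis of this report: the class name ResetDemo and the method provokeReset are hypothetical, no Flink code is involved, and the exact exception class and message may vary with the JDK version and operating system.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class: shows a server-side read failing after the peer resets the connection.
public class ResetDemo {

    /** Returns the IOException raised by the server-side read, or null if the read did not fail. */
    static IOException provokeReset() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral loopback port (port 0 lets the OS pick a free one).
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {

                // SO_LINGER with a timeout of 0 makes close() abort the connection,
                // which sends a TCP RST instead of the usual FIN handshake.
                client.setOption(StandardSocketOptions.SO_LINGER, 0);
                client.close();

                try {
                    // The server side now reads from a connection the peer has reset.
                    accepted.read(ByteBuffer.allocate(64));
                    return null;
                } catch (IOException e) {
                    return e;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        IOException e = provokeReset();
        System.out.println(e == null
                ? "read completed without an exception"
                : e.getClass().getName() + ": " + e.getMessage());
    }
}
```

If something similar is happening here, the likely suspects would be clients that open a connection to the REST port and drop it abruptly, for example a TCP health probe or a load balancer. That is an assumption to verify against the cluster's networking setup, not something the log above confirms.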
--
This message was sent by Atlassian Jira (v8.20.10#820010)