>>>> 4. Some hardware failure (restart / crash)
>>>> 5. Network problems
>>>>
>>>> Piotrek
>>>>
>>>> On Mon, 7 Dec 2020 at 23:31, Kye Bae wrote:
>>>>
>>>>> I forgot to mention: this is Flink 1.10.
>>>>
>>>> Then, we began to get the exception below from taskmanagers (random)
>>>> since yesterday, and the job began to fail/restart every hour or so.
>>>>
>>>> The job does recover after each restart, but sometimes it takes more
>>>> time to recover than allowed in our environment. On a few occasions, it
>>>> took more than a few restarts to fully recover.
>>>>
>>>> Can you provide some insight into what this error means and also what we
>>>> can do to prevent this in future?
>>>>
>>>> Thank you!
>>>>
>>>> +++
>>>> ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue -
>>>> Encountered error while consuming partitions
>>>> java.io.IOException: Connection reset by peer