in please look at the logs on other
> machines (maybe system logs)
> 3. Some OS failure - please look at the system logs on other machines
> 4. Some hardware failure (restart / crash)
> 5. Network problems
>
> Piotrek
>
> On Mon, Dec 7, 2020 at 11:31 PM Kye Bae wrote:
>
>
I forgot to mention: this is Flink 1.10.
-K
On Mon, Dec 7, 2020 at 5:08 PM Kye Bae wrote:
> Hello!
>
> We have a real-time streaming workflow that has been running for about 2.5
> weeks.
>
> Then, we began to get the exception below from taskmanagers (random) since
> yesterday, and the job began to fail/restart every hour or so.
Hello!
We have a real-time streaming workflow that has been running for about 2.5
weeks.
Then, we began to get the exception below from taskmanagers (random) since
yesterday, and the job began to fail/restart every hour or so.
The job does recover after each restart, but sometimes it takes more
>
>> I'll keep you up to date with my findings..
>>
>> Best,
>> Flavio
>>
>> On Mon, Nov 16, 2020 at 8:22 PM Kye Bae wrote:
>>
>>> Hello!
>>>
>>> The JVM metaspace is where all the classes (not class instances or
Hello!
The JVM metaspace is where all the classes (not class instances or objects)
get loaded. jmap -histo shows you heap space usage, not the
metaspace.
You could inspect what is happening in the metaspace by using jcmd (e.g.,
jcmd JPID VM.native_memory summary) after restarti
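
As a complement to the jcmd suggestion above, here is a minimal sketch of reading metaspace usage from inside the JVM through the standard java.lang.management API. The class name MetaspaceProbe and the helper metaspaceUsedBytes are hypothetical names made up for illustration, and matching the pool by the name "Metaspace" is an assumption that holds on HotSpot JVMs but may not hold on others. Note also that jcmd's VM.native_memory command only reports data when the JVM was started with -XX:NativeMemoryTracking=summary (or detail).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class MetaspaceProbe {

    // Returns the bytes currently used by the non-heap pool whose name
    // contains "Metaspace", or -1 if no such pool exists.
    // Assumption: HotSpot naming; other JVMs may name their pools differently.
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP
                    && pool.getName().contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    return usage.getUsed();
                }
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Print every memory pool so heap and non-heap usage can be compared.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-32s %-16s used=%d committed=%d%n",
                    pool.getName(), pool.getType(),
                    usage.getUsed(), usage.getCommitted());
        }
        System.out.println("Metaspace used bytes: " + metaspaceUsedBytes());
    }
}
```

Logging this value periodically (for example after each job restart) would show whether metaspace keeps growing, which is the symptom to look for when classes are loaded repeatedly.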
Not sure about Flink 1.10.x. Can share a few things up to Flink 1.9.x:
1. If your Flink cluster runs only one job, avoid using the dynamic classloader
for your job: start it from one of the Flink class paths. As of Flink
1.9.x, using the dynamic classloader results in the same classes getting
loaded e