Re: Flink and swapping question

Flavio Pompermaier Thu, 25 May 2017 01:22:34 -0700

I can confirm that after giving less memory to the Flink TM the job was
able to run successfully.
After almost 2 weeks of pain, here is a summary of our experience with Flink
in virtualized environments (such as VMWare ESXi):


   1. Disable the virtualization "feature" that transfers a VM from a
   heavily loaded physical machine to another one (to balance resource
   consumption)
   2. Check dmesg when a TM dies without logging anything (usually it goes
   OOM and the OS kills it, and the kernel log records the kill)
   3. CentOS 7 on ESXi seems to start swapping VERY early (in my case I see
   the OS start swapping even when 12 out of 32 GB of memory are free)!
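
As an illustration of item 2, here is a minimal Python sketch that scans
kernel log text for OOM-killer records. The function name find_oom_kills and
the regular expression are assumptions made for this example. The message
format comes from the line quoted further down in this thread ("Out of
memory: Kill process 13574 (java)"); other kernel versions may word it
differently, so treat the pattern as a starting point.

```python
import re
import subprocess

# Assumed pattern: matches "Out of memory: Kill process <pid> (<name>)" and
# the "Killed process" wording; adjust it if your kernel logs differently.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]*)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for each OOM-killer record."""
    return [(int(pid), name) for pid, name in OOM_PATTERN.findall(log_text)]


if __name__ == "__main__":
    try:
        # dmesg may need elevated privileges on some systems.
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
        log_text = result.stdout
    except OSError:
        # dmesg is not available on this machine.
        log_text = ""
    for pid, name in find_oom_kills(log_text):
        print(f"OOM killer terminated pid {pid} ({name})")
```

Running it on the machine where the TM died should print one line per
process the kernel killed, provided the record is still in the kernel ring
buffer.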

We're still investigating how this behavior could be fixed: the problem is
that it's better not to disable swapping because otherwise VMWare could
start ballooning (that is definitely worse...).
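
To spot the early swapping described in item 3, a small Python sketch can
read /proc/meminfo and flag the case where swap is in use while a large share
of memory is still available. The helper names and the 25% threshold are
assumptions chosen for illustration. The script also prints the kernel's
vm.swappiness setting, since that may be worth looking at; nothing in this
thread confirms it is the cause.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_in_use_kb(meminfo):
    """Swap currently in use, in kB."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


def swapping_with_free_memory(meminfo, min_available_fraction=0.25):
    """True if swap is used although at least the given fraction of RAM is available.

    The 0.25 default is an arbitrary illustrative threshold.
    """
    total = meminfo.get("MemTotal", 0)
    available = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    if total == 0:
        return False
    return swap_in_use_kb(meminfo) > 0 and available / total >= min_available_fraction


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        with open("/proc/sys/vm/swappiness") as f:
            swappiness = f.read().strip()
    except OSError:
        # Not a Linux host, or /proc is not readable.
        info, swappiness = {}, "unknown"
    print("swap in use (kB):", swap_in_use_kb(info))
    print("swapping despite available memory:", swapping_with_free_memory(info))
    print("vm.swappiness:", swappiness)
```

Run periodically (for example from cron) on each TM host, this gives a rough
timeline of when the OS starts swapping relative to available memory.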

I hope these tips can save someone else's day.

Best,
Flavio

On Wed, May 24, 2017 at 4:28 PM, Flavio Pompermaier <pomperma...@okkam.it>
wrote:

> Hi Greg, you were right! After running dmesg I found "Out of memory: Kill
> process 13574 (java)".
> This is really strange because the JVM of the TM is very calm.
> Moreover, there are 7 GB of memory available (out of 32) but somehow the
> OS decides to start swapping and, when it runs out of available swap
> memory, the OS decides to kill the Flink TM :(
>
> Any idea of what's going on here?
>
> On Wed, May 24, 2017 at 2:32 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Hi Greg,
>> I carefully monitored all TM memory with jstat -gcutil and there's no full
>> GC, only .
>> The initial situation on the dying TM is:
>>
>>   S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
>>   0.00 100.00  33.57  88.74  98.42  97.17    159    2.508     1    0.255    2.763
>>   0.00 100.00  90.14  88.80  98.67  97.17    197    2.617     1    0.255    2.873
>>   0.00 100.00  27.00  88.82  98.75  97.17    234    2.730     1    0.255    2.986
>>
>> After about 10 hours of processing it is:
>>
>>   0.00 100.00  21.74  83.66  98.52  96.94   5519   33.011     1    0.255   33.267
>>   0.00 100.00  21.74  83.66  98.52  96.94   5519   33.011     1    0.255   33.267
>>   0.00 100.00  21.74  83.66  98.52  96.94   5519   33.011     1    0.255   33.267
>>
>> So I don't think that OOM could be an option.
>>
>> However, the cluster is running on ESXi vSphere VMs and we have already
>> experienced unexpected job crashes because of ESXi moving a heavily loaded
>> VM to another (less loaded) physical machine. I wouldn't be surprised if
>> swapping is also handled somehow differently.
>> Looking at Cloudera widgets I see that the crash is usually preceded by
>> an intense cpu_iowait period.
>> I fear that Flink's unsafe access to memory could be a problem in those
>> scenarios. Am I wrong?
>>
>> Any insight or debugging technique is greatly appreciated.
>> Best,
>> Flavio
>>
>>
>> On Wed, May 24, 2017 at 2:11 PM, Greg Hogan <c...@greghogan.com> wrote:
>>
>>> Hi Flavio,
>>>
>>> Flink handles interrupts so the only silent killer I am aware of is
>>> Linux's OOM killer. Are you seeing such a message in dmesg?
>>>
>>> Greg
>>>
>>> On Wed, May 24, 2017 at 3:18 AM, Flavio Pompermaier <
>>> pomperma...@okkam.it> wrote:
>>>
>>>> Hi to all,
>>>> I'd like to know whether memory swapping could cause a taskmanager
>>>> crash.
>>>> In my cluster of virtual machines I'm seeing this strange behavior in my
>>>> Flink cluster: sometimes, if memory gets swapped, the taskmanager (on that
>>>> machine) dies unexpectedly without any log about the error.
>>>>
>>>> Is that possible or not?
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>
>>>
>
