I agree with Stephan's points. Thanks for reporting and let's investigate
this further.

One thing to keep in mind: I think VisualVM uses hprof for CPU sampling, which
has some known issues (
http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html).
For one thing, it samples threads in Java's RUNNABLE state, which does not
necessarily correspond to a thread that is running (in OS terms) and consuming
CPU. A thread blocked in a native select call (like epollWait()) stays in this
state, so the time it spends waiting can be reported as CPU self-time.
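
To illustrate the RUNNABLE point, here is a minimal sketch using only the JDK
(no Flink or Netty involved). The class and method names are made up for this
example, and the 500 ms sleep is just a crude way to let the thread reach
select(). It parks a thread in Selector.select() with nothing registered, then
reads the thread's Java state and, where the JVM supports it, its CPU time:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of Flink or Netty.
public class SelectStateDemo {

    // Starts a thread that blocks in Selector.select() and returns the
    // Java-level state observed while it is blocked.
    static Thread.State observeSelectState() throws Exception {
        Selector selector = Selector.open();
        Thread t = new Thread(() -> {
            try {
                selector.select(); // blocks in native code until wakeup()
            } catch (Exception e) {
                // ignored in this sketch
            }
        }, "select-demo");
        t.setDaemon(true);
        t.start();

        Thread.sleep(500); // assumption: enough time to enter select()
        Thread.State state = t.getState();

        // CPU time actually consumed by the thread, if the JVM can report it.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadCpuTimeSupported()) {
            long cpuNanos = mx.getThreadCpuTime(t.getId());
            System.out.println("CPU time (ns): " + cpuNanos);
        }

        selector.wakeup(); // let the thread return from select()
        t.join(2000);
        selector.close();
        return state;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("State while in select(): " + observeSelectState());
    }
}
```

The state is reported as RUNNABLE even though the thread is only waiting, which
is why a sampler that counts RUNNABLE threads can attribute a lot of
"self-time" to select().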


On Tue, May 5, 2015 at 9:23 PM, Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> That does not sound right, I agree. Can you tell us a bit more?
>
> - What version of Flink are you using?
>
> - I assume the NIO loop is executed by a Netty thread. Can you tell us
> whether it is from an "io.netty.*" thread or an "org.jboss.netty.*" thread?
> The former is from Flink's data network thread, the latter from Akka.
>
> - Is your job data heavy (data transfer is in progress most of the time), or
> is it compute heavy (network is not fully utilized)?
>
> Thanks for your help!
> Stephan
> On 05.05.2015 at 16:52, "Kruse, Sebastian" <sebastian.kr...@hpi.de> wrote:
>
> > Hi everyone,
> >
> > Every time I run jvisualvm on one of the machines in our cluster during
> > a Flink job, I see that NioEventLoop.select() takes 50% to 70% CPU
> > self-time. I wonder how severe this is. It might be busy-waiting time
> > that cannot be filled otherwise, but I wanted to ask whether you have
> > also faced this issue and/or know its cause.
> >
> > Cheers,
> > Sebastian
> >
>
