I agree with Stephan's points. Thanks for reporting and let's investigate this further.
Something to keep in mind: I think VisualVM uses hprof for CPU sampling, which has some known issues (http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html). For one thing, it samples threads in Java's RUNNABLE state, which does not necessarily correspond to a thread that is running (in OS terms) and consuming CPU. A blocking select call (like epollWait()) keeps the thread in this state.

On Tue, May 5, 2015 at 9:23 PM, Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> That does not sound right, I agree. Can you tell us a bit more?
>
> - What version of Flink are you using?
>
> - I assume the NIO loop is executed by a Netty thread. Can you tell us
> whether it is from a "io.netty.*" thread, or a "org.jboss.netty.*" thread?
> The former is from Flink's data network thread, the latter from akka.
>
> - Is your job data heavy (data transfer is in progress most of the time), or
> is it compute heavy (network is not fully utilized)?
>
> Thanks for your help!
> Stephan
>
> On 05.05.2015 at 16:52, "Kruse, Sebastian" <sebastian.kr...@hpi.de> wrote:
>
> > Hi everyone,
> >
> > Every time I run jvisualvm on one of the machines in our
> > cluster during a Flink job, I see that NioEventLoop.select() is taking 50%
> > to 70% CPU self-time. I wonder how severe this is. It might be busy-waiting
> > time that cannot be filled otherwise, but I wanted to ask you if you also
> > faced this issue and/or you know the cause of that circumstance.
> >
> > Cheers,
> > Sebastian
> >
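
PS: To illustrate the RUNNABLE point from the top of this mail, here is a minimal sketch that is not taken from Flink or Netty. The class name SelectStateDemo, the thread name and the 500 ms wait are all made up for the demo. It parks a thread in Selector.select() with no channels registered and asks the JVM which state it reports for that thread. On a typical HotSpot JVM I would expect RUNNABLE, even though the thread is idle from the OS point of view.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class SelectStateDemo {

    // Returns the java.lang.Thread.State reported for a thread that is
    // blocked inside Selector.select() with no channels registered.
    static Thread.State stateWhileBlockedInSelect()
            throws IOException, InterruptedException {
        Selector selector = Selector.open();

        Thread poller = new Thread(() -> {
            try {
                selector.select(); // blocks until wakeup() is called
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "demo-poller"); // hypothetical thread name
        poller.setDaemon(true);
        poller.start();

        // Crude synchronization (assumption): 500 ms is enough time
        // for the poller thread to enter select().
        Thread.sleep(500);
        Thread.State observed = poller.getState();

        // Release the poller and clean up.
        selector.wakeup();
        poller.join();
        selector.close();
        return observed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("State while blocked in select(): "
                + stateWhileBlockedInSelect());
    }
}
```

A sampler that attributes time to every thread it finds in the RUNNABLE state would count such a thread as busy, so a high self-time for NioEventLoop.select() does not by itself show that the CPU is actually being burned there.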