Ah, yeah, you're right. That is just wait time, not CPU time. We should check that profile; it must be something else on the list.

-Jay

On Mon, Feb 2, 2015 at 9:33 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mathias,
>
> From the hprof output, it seems that the top CPU consumers are
> socketAccept() and epollWait(). As far as I am aware, there haven't been any
> significant changes in the socket server code between 0.8.1 and 0.8.2. Could
> you run the same hprof test on 0.8.1 so that we can see the difference?
>
> Jaikiran,
>
> The fix you provided is probably unnecessary. The channel that we use in
> SimpleConsumer (BlockingChannel) is configured to be blocking. So even
> though the read from the socket is in a loop, each read blocks if there are
> no bytes received from the broker. So, that shouldn't cause extra CPU
> consumption.
>
> Thanks,
>
> Jun
>
> On Mon, Jan 26, 2015 at 10:05 AM, Mathias Söderberg <
> mathias.soederb...@gmail.com> wrote:
>
> > Hi Neha,
> >
> > I sent an e-mail earlier today, but noticed now that it didn't actually go
> > through.
> >
> > Anyhow, I've attached two files, one with output from a 10 minute run and
> > one with output from a 30 minute run. Realized that maybe I should've done
> > one or two runs with 0.8.1.1 as well, but nevertheless.
> >
> > I upgraded our staging cluster to 0.8.2.0-rc2, and I'm seeing the same CPU
> > usage as with the beta version (basically pegging all cores). If I manage
> > to find the time I'll do another run with hprof on the rc2 version later
> > today.
> >
> > Best regards,
> > Mathias
> >
> > On Tue Dec 09 2014 at 10:08:21 PM Neha Narkhede <n...@confluent.io> wrote:
> >
> >> The following should be sufficient
> >>
> >> java
> >> -agentlib:hprof=cpu=samples,depth=100,interval=20,lineno=y,thread=y,file=kafka.hprof
> >> <classname>
> >>
> >> You would need to start the Kafka server with the settings above for
> >> some time until you observe the problem.
> >>
> >> On Tue, Dec 9, 2014 at 3:47 AM, Mathias Söderberg <
> >> mathias.soederb...@gmail.com> wrote:
> >>
> >> > Hi Neha,
> >> >
> >> > Yeah sure. I'm not familiar with hprof, so any particular options I should
> >> > include or just run with defaults?
> >> >
> >> > Best regards,
> >> > Mathias
> >> >
> >> > On Mon Dec 08 2014 at 7:41:32 PM Neha Narkhede <n...@confluent.io> wrote:
> >> >
> >> > > Thanks for reporting the issue. Would you mind running hprof and sending
> >> > > the output?
> >> > >
> >> > > On Mon, Dec 8, 2014 at 1:25 AM, Mathias Söderberg <
> >> > > mathias.soederb...@gmail.com> wrote:
> >> > >
> >> > > > Good day,
> >> > > >
> >> > > > I upgraded a Kafka cluster from v0.8.1.1 to v0.8.2-beta and noticed that
> >> > > > the CPU usage on the broker machines went up by roughly 40%, from ~60% to
> >> > > > ~100% and am wondering if anyone else has experienced something similar?
> >> > > > The load average also went up by 2x-3x.
> >> > > >
> >> > > > We're running on EC2 and the cluster currently consists of four m1.xlarge,
> >> > > > with roughly 1100 topics / 4000 partitions. Using Java 7 (1.7.0_65 to be
> >> > > > exact) and Scala 2.9.2. Configurations can be found over here:
> >> > > > https://gist.github.com/mthssdrbrg/7df34a795e07eef10262.
> >> > > >
> >> > > > I'm assuming that this is not expected behaviour for 0.8.2-beta?
> >> > > >
> >> > > > Best regards,
> >> > > > Mathias
> >> > >
> >> > > --
> >> > > Thanks,
> >> > > Neha
> >>
> >> --
> >> Thanks,
> >> Neha
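
For anyone following the wait-time versus CPU-time point at the top of this thread, here is a minimal Java sketch. It is not taken from Kafka or from the attached profiles. It assumes the JVM supports per-thread CPU timing through ThreadMXBean, and the names WaitVsCpuTime and measureBlockedThread are made up for illustration. A CountDownLatch timeout stands in for a blocking socket read or epollWait(): the thread accrues wall-clock time while it is parked but close to no CPU time. As far as I understand, a sampling profiler that counts such blocked frames can therefore rank them high even though they burn no CPU.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Kafka.
final class WaitVsCpuTime {

    /**
     * Blocks the current thread for roughly blockMillis and returns
     * {wallNanos, cpuNanos} measured across that blocking call.
     */
    static long[] measureBlockedThread(long blockMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime();
        try {
            // Nobody counts the latch down, so await() parks the thread until the
            // timeout expires. This is a stand-in for a blocking read with no data.
            new CountDownLatch(1).await(blockMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocked", e);
        }
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] {wallNanos, cpuNanos};
    }

    public static void main(String[] args) {
        long[] result = measureBlockedThread(500);
        System.out.printf("wall=%d ms, cpu=%d ms%n",
                result[0] / 1_000_000, result[1] / 1_000_000);
    }
}
```

Running main should show a wall time near the requested 500 ms and a much smaller CPU time. Comparing the two numbers this way (or looking at per-thread CPU in top -H) is one way to tell whether a frame that dominates the hprof samples is really consuming CPU.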