All the new brokers are running 0.8.2.1, so I can only profile the new version; profiling the old one would mean reverting the change on some of the brokers. Restarting brokers causes clients to lose a select few messages, so it's not very desirable.

Profiling the new brokers using jVisualVM (I don't have a better profiler in production) doesn't show anything very odd AFAICT. Here is a link to a particular profile: http://i.imgur.com/8T7jrTw.png

Most of the time is spent on read calls.

Thanks,
Rajiv

On Fri, Aug 21, 2015 at 4:22 PM, Tao Feng <fengta...@gmail.com> wrote:

> Have you done a profiling on your broker process? Any hot code path
> differences between these two versions?
>
> Thanks,
> -Tao
>
> On Fri, Aug 21, 2015 at 3:59 PM, Rajiv Kurian <ra...@signalfuse.com>
> wrote:
>
> > The only thing I notice in the logs which is a bit unsettling is about a
> > once a second rate of messages of the type
> >
> > "Closing socket connection to some-ip-address". I used to see these
> > messages before but it seems like it's more often than usual. Also all the
> > clients that it seems to close connections with are running the Java
> > wrapper over the Scala SimpleConsumer. Is there any logging I can enable
> > to understand why exactly these connections are being closed so often?
> >
> > Thanks,
> >
> > Rajiv
> >
> > On Fri, Aug 21, 2015 at 3:50 PM, Rajiv Kurian <ra...@signalfuse.com>
> > wrote:
> >
> > > We upgraded a 9 broker cluster from version 0.8.1 to version 0.8.2.1.
> > > Actually we cherry-picked the commit
> > > at 41ba26273b497e4cbcc947c742ff6831b7320152 to get zkClient 0.5 because
> > > we ran into a bug described at
> > > https://issues.apache.org/jira/browse/KAFKA-824
> > >
> > > Right after the update the CPU spiked quite a bit but I am guessing that
> > > is because the brokers were pulling in log segments from other brokers
> > > right after restart. The CPU remained elevated for a while and I thought
> > > it would come down after things settled down but the CPU has remained
> > > higher even after a day.
> > >
> > > Our steady state CPU on the brokers went from about 28% (0.8.1) to 34%
> > > (0.8.2.1). We do not use compression on any topic or partition.
> > > Our incoming traffic (number of messages/sec) has not increased at all. Our
> > > incoming bytes/sec has actually decreased because we managed to reduce
> > > the size of one of our message types from 256 bytes to 32 bytes. The message
> > > size change was made hours after the Kafka version update and didn't seem to
> > > harm or help the CPU. The bytes-in/sec and bytes-out/sec metrics have
> > > definitely gone down after the message size reduction.
> > >
> > > Here is a link to the graph showing how the CPU went up -
> > > http://i.imgur.com/KVJLzsX.png?1 The restarts were done from 18:00 to
> > > 19:00 and I'd expect the CPU to go up at that time but I can't explain
> > > the steady state CPU rise.
> > >
> > > Are there any known performance regressions after 0.8.1? Any hints on
> > > what I should investigate if you think that this is not normal?
> > >
> > > Thanks,
> > > Rajiv
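
One way to put a number on the "Closing socket connection to some-ip-address" messages mentioned in the thread might be to count them per client address in the broker log. The following is only a rough sketch: the function name `count_closes` is hypothetical, the sample lines are made up for illustration, and the leading slash and trailing period around the address are assumptions about the log format rather than something stated in the thread.

```python
import re
from collections import Counter

# Only the phrase "Closing socket connection to <address>" comes from the
# thread. The optional leading "/" and the trailing "." handled below are
# assumptions about how the broker formats the address.
CLOSE_RE = re.compile(r"Closing socket connection to /?(\S+)")


def count_closes(lines):
    """Return a Counter mapping client address -> number of close messages."""
    counts = Counter()
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            # Drop a trailing period if the message ends the address with one.
            counts[match.group(1).rstrip(".")] += 1
    return counts


# Made-up sample lines, not taken from any real broker log.
sample = [
    "INFO Closing socket connection to /10.0.0.1. (some.logger)",
    "INFO Closing socket connection to /10.0.0.2. (some.logger)",
    "INFO Some unrelated message",
    "INFO Closing socket connection to /10.0.0.1. (some.logger)",
]

closes = count_closes(sample)
for address, n in closes.most_common():
    print(address, n)
```

Since an open file is iterable line by line, the same function could be applied to a real log with `count_closes(open(path))`, where `path` is wherever the broker log happens to live. If a handful of addresses dominate the counts, that would at least narrow down which SimpleConsumer clients to look at first.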