Perhaps you can see if you can reproduce this issue and then file a JIRA.

Thanks,

Jun
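For anyone trying to reproduce this, the raw mbean values can be read directly over JMX, bypassing graphite entirely. Below is a minimal sketch in Java; the host and JMX port (9999) are assumptions, and the mbean name follows the new 0.8.2-style naming discussed in this thread (verify it in jconsole if your broker registers it differently):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class BrokerMetricCheck {
        public static void main(String[] args) throws Exception {
            // Assumes the broker was started with JMX enabled, e.g. JMX_PORT=9999.
            JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();
                // Assumed 0.8.2-style mbean name; the OneMinuteRate attribute matches
                // the rates referenced elsewhere in this thread.
                ObjectName messagesIn =
                    new ObjectName("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec");
                Object oneMinuteRate = mbsc.getAttribute(messagesIn, "OneMinuteRate");
                System.out.println("MessagesInPerSec OneMinuteRate = " + oneMinuteRate);
            } finally {
                connector.close();
            }
        }
    }

If the value read this way agrees with what graphite shows per broker, the problem is more likely in the reporting pipeline than in the mbean itself.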
On Fri, Jan 16, 2015 at 4:39 PM, svante karlsson <s...@csi.se> wrote:

> Hmm, "produce msg/sec in rate" seems to be per broker and "produce msg/sec" should also be per broker and thus be related. The problem is that for a time period the graphs indicated that 1) messages were only produced to one broker, 2) messages were produced to two brokers.

> When I restarted the brokers everything looked normal again. I made no changes to the parts that were collecting the metrics during this time. This is of course hearsay since I can't repeat it, but at least the graphs support the view that something is strange.

> I agree that the value looks ok for most ("all") of the time, but I suspect that there might be issues here.

> /svante

> 2015-01-17 0:19 GMT+01:00 Jun Rao <j...@confluent.io>:

> > I did some quick tests and the mbean values look reasonable. On the producer side, produce msg/sec is actually for all brokers.

> > Thanks,

> > Jun

> > On Fri, Jan 16, 2015 at 12:09 PM, svante karlsson <s...@csi.se> wrote:

> > > Disregard the previous message, it was sent accidentally.

> > > Jun,

> > > I don't know if it was an issue with graphite or the mbean, and I have not seen it since - and we have tried several cases of failover.

> > > That said, I have the feeling that it was a Kafka issue and I'm a bit suspicious about the new mbeans.

> > > I attach a screenshot from the grafana dashboard; if you look at the first graph (top left), at ~10% it shows the startup after the upgrade.

> > > This is a 3-node cluster with a topic of 2 partitions. When we start up a single producer it produces messages to both partitions without message loss. I know that all messages are acked.

> > > If you look at the "produce msg/sec" graph it seems to hit 2 servers (it's per broker), but "messages in rate", "byte in rate" & "byte out rate" (all from the new mbeans) look as if the data only hits one broker. (Those are also per broker.)

> > > At 70% I restarted two brokers one after the other. After that point all three graphs look fine.

> > > I'm not at work now and can't dig into the graphite data, but I now see that the "fetch follower" graph also looks strange.

> > > I can't file it as a bug report since I can't reproduce it, but I have a distinct feeling that I either can't trust the new mbeans or have to find another explanation.

> > > Regard it as an observation in case someone else reports issues.

> > > Thanks,

> > > svante

> > > 2015-01-16 20:56 GMT+01:00 svante karlsson <s...@csi.se>:

> > > > Jun,

> > > > I don't know if it was an issue with graphite or the mbean, but I have not seen it since - and we have tried several cases of failover and this problem has only been seen once.

> > > > That said, I have the feeling that it was a Kafka issue and I'm a bit suspicious about the new mbeans.

> > > > I attach a screenshot from the grafana dashboard; if you look at the first graph (top left), at ~10% it shows the startup after the upgrade.

> > > > This is a 3-node cluster with a topic of 2 partitions. When we start up a single producer it produces messages to both partitions without message loss. I know that all messages are acked.
> > > > If you look at the "produce msg/sec" graph it seems to hit 2 servers (it's per broker).

> > > > Bad picture, but

> > > > 2015-01-16 18:05 GMT+01:00 Jun Rao <j...@confluent.io>:

> > > > > Svante,

> > > > > I tested this out locally and the mbeans for those metrics do show up on startup. Can you reproduce the issue reliably? Also, is what you saw an issue with the mbean itself or with graphite?

> > > > > Thanks,

> > > > > Jun

> > > > > On Fri, Jan 16, 2015 at 4:38 AM, svante karlsson <s...@csi.se> wrote:

> > > > > > I upgraded two small test clusters and I had two small issues, but I'm not yet clear whether those were caused by us using ansible to configure and deploy the cluster.

> > > > > > The first issue could be us doing something bad when distributing the update (I updated, not reinstalled), but it should be easy for you to disregard since it seems so trivial.

> > > > > > We replace kafka-server-start.sh with something else, but we had the line

> > > > > > EXTRA_ARGS="-name kafkaServer -loggc"

> > > > > > and then kafka-run-class.sh exits without starting the VM and complains about unknown options (both -name and -loggc). Once we removed the EXTRA_ARGS everything starts.

> > > > > > As I said - everyone should have this issue if it was a real problem...

> > > > > > The second thing is regarding the JMX beans. I reconfigured our graphite monitoring and noticed that the following metrics stopped working on one broker:
> > > > > > - server.BrokerTopicMetrics.MessagesInPerSec.OneMinuteRate
> > > > > > - server.BrokerTopicMetrics.ByteInPerSec.OneMinuteRate
> > > > > > - server.BrokerTopicMetrics.ByteOutPerSec.OneMinuteRate

> > > > > > I had graphs running and it looked like the traffic was dropping on those metrics, but our producers were working without problems and the metric network.RequestMetrics.Produce.RequestsPerSec.OneMinuteRate confirmed that on all brokers.

> > > > > > A restart of the offending broker brought the metrics back online.

> > > > > > /svante

> > > > > > 2015-01-16 3:42 GMT+01:00 Gwen Shapira <gshap...@cloudera.com>:

> > > > > > > It would make sense to enable it after we have the authorization feature and admins can control who can delete what.

> > > > > > > On Thu, Jan 15, 2015 at 6:32 PM, Jun Rao <j...@confluent.io> wrote:

> > > > > > > > Yes, I agree it's probably better not to enable "delete.topic.enable" by default.

> > > > > > > > Thanks,

> > > > > > > > Jun

> > > > > > > > On Thu, Jan 15, 2015 at 6:29 PM, Joe Stein <joe.st...@stealth.ly> wrote:

> > > > > > > > > I think that is a change of behavior that organizations may get burned on. Right now there is no delete data feature.
> > > > > > > > > If an operations team upgrades to 0.8.2 and someone decides to delete a topic, then there will be data loss. The organization may not have wanted that to happen. I would argue to not have a way to "by default" delete data. There is something actionable about consciously turning on a feature that allows anyone with access to kafka-topics (or zookeeper, for that matter) to delete Kafka data. If folks want that feature then flip the switch prior to the upgrade, or after it with a rolling restart, and have at it. By not setting it as the default they will know they have to turn it on and figure out what they need to do from a security perspective (until Kafka gives them that) to protect their data (through network or other types of measures).

> > > > > > > > > On Thu, Jan 15, 2015 at 8:24 PM, Manikumar Reddy <ku...@nmsworks.co.in> wrote:

> > > > > > > > > > Also, can we remove the "delete.topic.enable" config property and enable topic deletion by default?

> > > > > > > > > > On Jan 15, 2015 10:07 PM, "Jun Rao" <j...@confluent.io> wrote:

> > > > > > > > > > > Thanks for reporting this. I will remove that option in RC2.

> > > > > > > > > > > Jun

> > > > > > > > > > > On Thu, Jan 15, 2015 at 5:21 AM, Jaikiran Pai <jai.forums2...@gmail.com> wrote:

> > > > > > > > > > > > I just downloaded the Kafka binary and am trying this on my 32-bit JVM (Java 7). Trying to start Zookeeper or the Kafka server keeps failing with "Unrecognized VM option 'UseCompressedOops'":

> > > > > > > > > > > > ./zookeeper-server-start.sh ../config/zookeeper.properties
> > > > > > > > > > > > Unrecognized VM option 'UseCompressedOops'
> > > > > > > > > > > > Error: Could not create the Java Virtual Machine.
> > > > > > > > > > > > Error: A fatal exception has occurred. Program will exit.

> > > > > > > > > > > > Same with the Kafka server startup scripts. My Java version is:

> > > > > > > > > > > > java version "1.7.0_71"
> > > > > > > > > > > > Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> > > > > > > > > > > > Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)

> > > > > > > > > > > > Should there be a check in the script before adding this option?

> > > > > > > > > > > > -Jaikiran

> > > > > > > > > > > > On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:

> > > > > > > > > > > > > + users mailing list. It would be great if people can test this out and report any blocker issues.
> > > > > > > > > > > > > Thanks,

> > > > > > > > > > > > > Jun

> > > > > > > > > > > > > On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao <j...@confluent.io> wrote:

> > > > > > > > > > > > > > This is the first candidate for release of Apache Kafka 0.8.2.0. There have been some changes since the 0.8.2 beta release, especially in the new Java producer API and the JMX mbean names. It would be great if people can test this out thoroughly. We are giving people 10 days for testing and voting.

> > > > > > > > > > > > > > Release Notes for the 0.8.2.0 release:
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html

> > > > > > > > > > > > > > *** Please download, test and vote by Friday, Jan 23rd, 7pm PT

> > > > > > > > > > > > > > Kafka's KEYS file containing the PGP keys we use to sign the release, in addition to the md5, sha1 and sha2 (SHA256) checksums:
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS

> > > > > > > > > > > > > > * Release artifacts to be voted upon (source and binary):
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/

> > > > > > > > > > > > > > * Maven artifacts to be voted upon prior to release:
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/

> > > > > > > > > > > > > > * scala-doc:
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package

> > > > > > > > > > > > > > * java-doc:
> > > > > > > > > > > > > > https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/

> > > > > > > > > > > > > > * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.0 tag:
> > > > > > > > > > > > > > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5

> > > > > > > > > > > > > > Thanks,

> > > > > > > > > > > > > > Jun
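For anyone verifying a downloaded release artifact against the published checksums before voting, here is a minimal sketch that recomputes the sha2 (SHA256) digest locally; the artifact file name below is only a placeholder:

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.security.MessageDigest;

    public class Sha256Check {
        public static void main(String[] args) throws Exception {
            // Placeholder name: point this at whichever release artifact you downloaded.
            String artifact = args.length > 0 ? args[0] : "kafka_2.10-0.8.2.0.tgz";
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            try (InputStream in = Files.newInputStream(Paths.get(artifact))) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    digest.update(buffer, 0, read);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            // Compare this value with the published SHA256 checksum for the artifact.
            System.out.println(hex);
        }
    }

The md5 and sha1 checksums can be checked the same way by passing "MD5" or "SHA-1" to MessageDigest.getInstance; the PGP signatures still need to be verified separately against the KEYS file.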