With acks=1, acked messages could be lost when the leader fails.

Thanks,

Jun
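(The trade-off Jun describes is controlled by the producer-side ack setting. A minimal sketch of the two options in varnishkafka/librdkafka configuration terms, using the property name Andrew quotes below; values and comments are illustrative, not a recommendation:)

    # Leader-only ack: low latency, but messages acked by the leader and not yet
    # replicated can be lost if that leader fails.
    kafka.topic.request.required.acks = 1
    # Wait for all in-sync replicas: higher latency, but survives a single leader failure.
    #kafka.topic.request.required.acks = -1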
On Mon, Sep 29, 2014 at 8:04 AM, Andrew Otto <o...@wikimedia.org> wrote:

> This happened again to me this weekend. I've done some sleuthing, and definitely can see some crazy paging stats when this lockup happens. For the curious, more info can be found here: https://bugzilla.wikimedia.org/show_bug.cgi?id=69667. I had tuned dirty_expire_centisecs from 30 seconds to 10, but this still happened again. I'll continue to troubleshoot and tune. This is slow going because it is not regularly reproducible; I have to make a single change and then wait a week or two for the timeout to occur.
>
> Here's a related question. When the timeout happens, we lose some messages. Our producer is varnishkafka, which uses the librdkafka producer client. librdkafka keeps track of produce errors. We have kafka.topic.request.required.acks = 1. According to librdkafka, all messages sent have been ACKed by the leader of the partition to which the messages are sent. Also, when we lose messages due to this timeout, the broker that times out is always the controller. When it attempts to reconnect to Zookeeper, we see:
>
> INFO kafka.utils.ZkUtils$ - conflict in /controller data: {"version":1,"brokerid":21,"timestamp":"1411879981756"} stored data: {"version":1,"brokerid":22,"timestamp":"1407187809296"}
>
> In the case when a controller drops out of the ISR for a few seconds, is it possible for this confused broker to drop ACKed messages?
>
> On Thu, Jul 3, 2014 at 12:48 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > Are you on Linux? We have seen this pattern (user/sys time low and real time high in GC time) before. In our case, the problem was due to disk I/O. When there are lots of dirty pages (in our case, caused by log4j logging), Linux can draft user threads (in this case, GC threads) to flush the dirty pages, so all that time in real was spent on disk I/O rather than on real GCs. The fix is to tune dirty_expire_centisecs and dirty_writeback_centisecs to flush dirty pages more frequently and avoid such drafting.
> >
> > Thanks,
> >
> > Jun
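(A sketch of the tuning Jun refers to; the kernel parameters live under vm.*, and the values below are illustrative only, since appropriate settings depend on the disks and the write load:)

    # /etc/sysctl.conf (or sysctl -w ...); units are hundredths of a second.
    # Defaults are typically 3000 (30s) and 500 (5s).
    vm.dirty_expire_centisecs = 1000      # pages dirty for more than 10s become eligible for writeback
    vm.dirty_writeback_centisecs = 200    # wake the writeback threads every 2s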
> > On Wed, Jul 2, 2014 at 1:32 PM, Andrew Otto <o...@wikimedia.org> wrote:
> >
> > > Hi again!
> > >
> > > I've been having this issue consistently since I first started this thread, but it was happening infrequently enough for me to brush it aside and just run an election to rebalance brokers.
> > >
> > > I recently expanded (and reinstalled) our Kafka cluster so that it now has 4 brokers with a default replication factor of 3 for each partition. I also switched over to G1GC as recommended here: https://kafka.apache.org/081/ops.html (even though we are still running Kafka 0.8.0; we hope to upgrade soon).
> > >
> > > Now, only one of the 4 brokers (analytics1021, the same problem broker we saw before) gets its ZK connection expired, and even more frequently than before. Previously it was less than once a week; now I am seeing this happen multiple times a day.
> > >
> > > I've posted all the relevant logs from a recent event here: https://gist.github.com/ottomata/e42480446c627ea0af22
> > >
> > > This includes the GC log on the offending Kafka broker during the time this happened. I am pretty green when it comes to GC tuning, but I do see this interesting stat:
> > >
> > > [Times: user=0.14 sys=0.00, real=11.47 secs]
> > >
> > > Did Kafka's JVM really just take 11.47 secs to do a GC there? I'm probably missing something, but I don't see which part of that time summary accounts for the bulk of that GC time.
> > >
> > > This is strange, right? This broker is identically configured to all its peers, and should be handling on average the exact same amount and type of traffic. Anyone have any advice?
> > >
> > > Thanks!
> > > -Andrew Otto
> > >
> > > On Fri, Mar 21, 2014 at 6:48 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > > > I see, that makes sense. Thanks a lot for clarifying!
> > > >
> > > > -Neha
> > > >
> > > > On Fri, Mar 21, 2014 at 11:01 AM, Bae, Jae Hyeon <metac...@gmail.com> wrote:
> > > >
> > > > > Let me clarify the situation. I forgot to mention that my case might not be a general one, because Netflix uses Apache Curator as the main zookeeper client and ZkClient in Kafka has to be bridged to Apache Curator, so the behavior I have seen might not be general.
> > > > >
> > > > > Kafka's ZKSessionExpireListener.handleNewSession() reinstates all ephemeral nodes and watchers, but handleNewSession() never kicked in in my case. So, I created a Netflix-internal version of ZkClient to replace the ephemeral node creation and watcher reinstatement.
> > > > >
> > > > > I have a plan to remove all external dependencies from Kafka soon, and then I might be able to say more about Kafka's zk resiliency.
> > > > >
> > > > > On Fri, Mar 21, 2014 at 7:51 AM, Jun Rao <jun...@gmail.com> wrote:
> > > > >
> > > > > > Hi, Jae,
> > > > > >
> > > > > > My understanding is that ephemeral nodes can only be lost on ZK session expiration. Are you saying that there are other cases in which ephemeral nodes can be lost?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Mar 20, 2014 at 9:52 PM, Bae, Jae Hyeon <metac...@gmail.com> wrote:
> > > > > >
> > > > > > > This issue is about zookeeper resiliency.
> > > > > > >
> > > > > > > What I have done is replace ephemeral node creation with Apache Curator's PersistentEphemeralNode recipe, to reinstate ephemeral nodes after a zookeeper blip. All watchers should also be reinstated. Kafka internally only handles the session expired event, but there are many other bad zookeeper situations where Kafka can go haywire.
> > > > > > >
> > > > > > > On Thu, Mar 20, 2014 at 9:23 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > > > >
> > > > > > > > 1. What does 'zookeeper state changed (Expired)' mean?
> > > > > > > >
> > > > > > > > 2. Has anyone seen issues like this before? Where zookeeper connections are flaky enough to cause leader elections?
> > > > > > > >
> > > > > > > > It means zookeeper expired the session. The most common reason for this is client side GC (in your case, the client is the Kafka broker), though if your zookeeper setup is flaky, this might even point to a server side I/O bottleneck or GC. Could you inspect the broker's and zookeeper's gc logs?
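(To act on Neha's suggestion, the broker's JVM can be started with GC logging enabled. A sketch of commonly used HotSpot flags from the JDK 7/8 era; where they get wired in, for example KAFKA_GC_LOG_OPTS or KAFKA_JVM_PERFORMANCE_OPTS in kafka-run-class.sh, varies by install and is an assumption here, as is the log path:)

    -Xloggc:/var/log/kafka/kafkaServer-gc.log
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -XX:+PrintGCApplicationStoppedTime   # total stopped time, which helps separate actual GC work
                                         # (user/sys) from wall-clock stalls caused by something else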
> > > > > > > > On Thu, Mar 20, 2014 at 3:26 PM, Andrew Otto <o...@wikimedia.org> wrote:
> > > > > > > >
> > > > > > > > > Hm, just saw something a little fishy.
> > > > > > > > >
> > > > > > > > > (In the following logs, analytics1021 (id 21) and analytics1022 (id 22) are brokers, and analytics1023, analytics1024, and analytics1025 are zookeepers.)
> > > > > > > > >
> > > > > > > > > At 2014-03-20 21:12:26, analytics1021 lost its connection to zookeeper. It reconnected to analytics1023, but as it was doing so, it logged a 'zookeeper state changed (Expired)' message, which apparently caused the newly opened socket to close again. Am I reading that right?
> > > > > > > > >
> > > > > > > > > In the time that it took analytics1021 to finally re-establish a zookeeper connection, analytics1022 had become the leader for all partitions. I can rebalance the leaders, but I'd prefer if this didn't happen in the first place.
> > > > > > > > >
> > > > > > > > > 1. What does 'zookeeper state changed (Expired)' mean?
> > > > > > > > > 2. Has anyone seen issues like this before? Where zookeeper connections are flaky enough to cause leader elections?
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > > -Andrew Otto
> > > > > > > > >
> > > > > > > > > [2014-03-20 21:12:26,427] 685698013 [main-SendThread(analytics1025.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 10067ms for sessionid 0x1425d903a3f1f89, closing socket connection and attempting reconnect
> > > > > > > > > [2014-03-20 21:12:26,528] 685698114 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (Disconnected)
> > > > > > > > > [2014-03-20 21:12:27,252] 685698838 [main-SendThread(analytics1025.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server analytics1023.eqiad.wmnet/10.64.5.15:2181
> > > > > > > > > [2014-03-20 21:12:27,271] 685698857 [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to analytics1023.eqiad.wmnet/10.64.5.15:2181, initiating session
> > > > > > > > > [2014-03-20 21:12:27,273] 685698859 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (Expired)
> > > > > > > > > [2014-03-20 21:12:27,273] 685698859 [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x1425d903a3f1f89 has expired, closing socket connection
> > > > > > > > > [2014-03-20 21:12:27,273] 685698859 [main-EventThread] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@63203b59
> > > > > > > > > [2014-03-20 21:12:27,290] 685698876 [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO kafka.server.KafkaZooKeeper - re-registering broker info in ZK for broker 21
> > > > > > > > > [2014-03-20 21:12:27,290] 685698876 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down
> > > > > > > > > [2014-03-20 21:12:27,290] 685698876 [main-SendThread()] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server analytics1023.eqiad.wmnet/10.64.5.15:2181
> > > > > > > > > [2014-03-20 21:12:27,292] 685698878 [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to analytics1023.eqiad.wmnet/10.64.5.15:2181, initiating session
> > > > > > > > > [2014-03-20 21:12:27,313] 685698899 [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server analytics1023.eqiad.wmnet/10.64.5.15:2181, sessionid = 0xff429b4201dd2352, negotiated timeout = 6000
> > > > > > > > > [2014-03-20 21:12:27,314] 685698900 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (SyncConnected)
> > > > > > > > > [2014-03-20 21:12:27,324] 685698910 [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO kafka.server.KafkaZooKeeper - done re-registering broker
> > > > > > > > > [2014-03-20 21:12:27,324] 685698910 [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO kafka.server.KafkaZooKeeper - Subscribing to /brokers/topics path to watch for new topics
> > > > > > > > > [2014-03-20 21:12:27,370] 685698956 [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO kafka.server.ZookeeperLeaderElector$LeaderChangeListener - New leader is 22
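(The "negotiated timeout = 6000" in the session establishment above corresponds to the broker's ZooKeeper session timeout, which defaults to 6000 ms in Kafka 0.8. A sketch of the relevant broker server.properties settings; the raised value is illustrative only, and trades slower failure detection for more headroom to ride out long GC or I/O stalls:)

    # server.properties (values shown are a sketch, not a recommendation)
    zookeeper.connect=analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad
    zookeeper.session.timeout.ms=6000        # default; matches "negotiated timeout = 6000" above
    #zookeeper.session.timeout.ms=30000      # longer timeout survives multi-second pauses
    zookeeper.connection.timeout.ms=6000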