[ https://issues.apache.org/jira/browse/KAFKA-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391058#comment-15391058 ]
ASF GitHub Bot commented on KAFKA-3689:
---------------------------------------

GitHub user rnpridgeon opened a pull request:

    https://github.com/apache/kafka/pull/1658

    Kafka 3689 - process only distinct connectionIds within processDisconnected()

    @ijuma discovered a possible scenario in which connectionQuotas may be [doubly-decremented](https://issues.apache.org/jira/browse/KAFKA-3689?focusedCommentId=15385962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15385962)

    This PR intends to remove that possibility by first filtering out any duplicate connectionIds. This way each

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rnpridgeon/kafka KAFKA-3689

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1658.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1658

----
commit 59077eff1e26eb6bae590188d9d1d6fa1166cfe9
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2016-07-21T22:09:47Z

    KAFKA-3983 - Add additional information to debug logging to aid in debugging efforts

commit 35e85eb4871a4b76f3bb2141a971c92140a0c338
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2016-07-22T14:29:25Z

    Edits per review in pull 1648

commit 44a1b0dce6c8f0948ed24ae23d35ce0f16fff860
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2016-07-23T13:11:25Z

    Merge remote-tracking branch 'upstream/trunk' into trunk

commit a2d8acf96d470319542ca9db3e5260220c27d22c
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2016-07-24T12:54:26Z

    KAFKA-3689 - process only distinct connectionIds within processDisconnected()

----


> Exception when attempting to decrease connection count for address with no
> connections
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3689
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3689
>             Project: Kafka
>          Issue Type: Bug
>          Components: network
>    Affects Versions: 0.9.0.1
>         Environment: ubuntu 14.04,
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.2)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> 3 broker cluster (all 3 servers identical - Intel Xeon E5-2670 @2.6GHz,
> 8cores, 16 threads 64 GB RAM & 1 TB Disk)
> Kafka Cluster is managed by 3 server ZK cluster (these servers are different
> from Kafka broker servers). All 6 servers are connected via 10G switch.
> Producers run from external servers.
>            Reporter: Buvaneswari Ramanan
>            Assignee: Jun Rao
>             Fix For: 0.10.1.0, 0.10.0.1
>
>         Attachments: KAFKA-3689.log.redacted, kafka-3689-instrumentation.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> As per Ismael Juma's suggestion in email thread to us...@kafka.apache.org
> with the same subject, I am creating this bug report.
> The following error occurs in one of the brokers in our 3 broker cluster,
> which serves about 8000 topics. These topics are single partitioned with a
> replication factor = 3. Each topic gets data at a low rate – 200 bytes per
> sec. Leaders are balanced across the topics.
> Producers run from external servers (4 Ubuntu servers with same config as the
> brokers), each producing to 2000 topics utilizing kafka-python library.
> This error message occurs repeatedly in one of the servers. Between the hours
> of 10:30am and 1:30pm on 5/9/16, there were about 10 Million such
> occurrences. This was right after a cluster restart.
> This is not the first time we got this error in this broker. In those
> instances, error occurred hours / days after cluster restart.
> =====================================================
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count for address with no connections, address: /X.Y.Z.144 (actual network address masked)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
>         at scala.collection.AbstractMap.getOrElse(Map.scala:59)
>         at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.network.Processor.run(SocketServer.scala:445)
>         at java.lang.Thread.run(Thread.java:745)
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count for address with no connections, address: /X.Y.Z.144
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
>         at scala.collection.AbstractMap.getOrElse(Map.scala:59)
>         at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.network.Processor.run(SocketServer.scala:445)
>         at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
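
For readers following the thread, the double decrement discussed in the pull request and the reported exception can be illustrated with a small, self-contained sketch. This is not the code from the pull request or from SocketServer.scala: `ToyConnectionQuotas`, `DisconnectSketch`, and `remoteHostOf` are hypothetical names invented here, and addresses are plain strings for brevity. It only assumes, as the stack trace suggests, that decrementing an address with no tracked connections throws an IllegalArgumentException.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a per-address connection counter.
// It is NOT kafka.network.ConnectionQuotas; addresses are plain strings here.
class ToyConnectionQuotas {
  private val counts = mutable.Map[String, Int]()

  // Record one more open connection for the address.
  def inc(address: String): Unit = counts.synchronized {
    counts.put(address, counts.getOrElse(address, 0) + 1)
  }

  // Record one closed connection; throws if the address has no tracked connections.
  def dec(address: String): Unit = counts.synchronized {
    val count: Int = counts.getOrElse(address,
      throw new IllegalArgumentException(
        s"Attempted to decrease connection count for address with no connections, address: $address"))
    if (count == 1) counts.remove(address) else counts.put(address, count - 1)
  }

  // Current number of tracked connections for the address (0 if none).
  def get(address: String): Int = counts.synchronized {
    counts.getOrElse(address, 0)
  }
}

object DisconnectSketch {
  // Hypothetical helper: decrement once per distinct connection id, so an id
  // that is reported as disconnected twice cannot be decremented twice.
  // `remoteHostOf` is an assumed lookup from connection id to remote address.
  def processDisconnected(disconnected: Seq[String],
                          remoteHostOf: String => String,
                          quotas: ToyConnectionQuotas): Unit =
    disconnected.distinct.foreach(id => quotas.dec(remoteHostOf(id)))
}
```

In this sketch, passing the same connection id twice results in a single `dec` call because of `distinct`. Without that filter, the second `dec` for an address whose count has already reached zero would throw the exception shown in the log above.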