I created the following bug report: https://issues.apache.org/jira/browse/KAFKA-3088
-----Original Message----- From: "Dave Peterson" <d...@dspeterson.com> Sent: Monday, January 11, 2016 9:56am To: users@kafka.apache.org Subject: Re: trouble sending produce requests to 0.9.0.0 broker cluster Ok, thanks for the information. I tried setting the client ID to "bruce" and the problem disappeared. I looked at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol and it doesn't say anything about an empty client ID not being allowed (even if it isn't allowed, Kafka shouldn't crash on receiving such a request). Therefore I think this should be considered a bug, and a regression from 0.8. As a workaround, I'll avoid sending requests with an empty client ID. Thanks, Dave On 1/11/2016 9:22 AM, Dana Powers wrote: > Looks like you aren't setting the request client-id, and server is crashing > on it. I'm not sure whether server api is expected to work w/o client-id, > but you can probably fix by sending one. Fwiw, kafka-python sends > 'kafka-python' unless user specifies something else. > > -Dana > On Jan 11, 2016 8:41 AM, <d...@dspeterson.com> wrote: > >> I forgot to include some information about this problem. When I sent >> the produce request, the following appeared in server.log: >> >> [2016-01-10 23:03:44,957] ERROR [KafkaApi-3] error when handling >> request Name: ProducerRequest; Version: 0; CorrelationId: 1; ClientId: >> null; RequiredAcks: 1; AckTimeoutMs: 10000 ms; TopicAndPartition: >> [topic_1,3] -> 37 (kafka.server.KafkaApis) >> java.lang.NullPointerException >> at >> org.apache.kafka.common.metrics.JmxReporter.getMBeanName(JmxReporter.java:127) >> at >> org.apache.kafka.common.metrics.JmxReporter.addAttribute(JmxReporter.java:106) >> at >> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76) >> at >> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288) >> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177) >> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162) >> at >> kafka.server.ClientQuotaManager.getOrCreateQuotaSensors(ClientQuotaManager.scala:209) >> at >> kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:111) >> at >> kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$2(KafkaApis.scala:353) >> at >> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371) >> at >> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371) >> at >> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:348) >> at >> kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:366) >> at kafka.server.KafkaApis.handle(KafkaApis.scala:68) >> at >> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60) >> at java.lang.Thread.run(Thread.java:745) >> >> Later when I tried sending another produce request I got a somewhat >> similar error: >> >> [2016-01-11 08:15:05,153] ERROR [KafkaApi-3] error when handling >> request Name: ProducerRequest; Version: 0; CorrelationId: 1; ClientId: >> null; RequiredAcks: 1; AckTimeoutMs: 10000 ms; TopicAndPartition: >> [topic_1,3] -> 37 (kafka.server.KafkaApis) >> java.lang.IllegalArgumentException: A metric named 'MetricName >> [name=throttle-time, group=Produce, description=Tracking average >> throttle-time per client, tags={client-id=null}]' already exists, can't >> register another one. >> at >> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:285) >> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177) >> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162) >> at >> kafka.server.ClientQuotaManager.getOrCreateQuotaSensors(ClientQuotaManager.scala:209) >> at >> kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:111) >> at >> kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$2(KafkaApis.scala:353) >> at >> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371) >> at >> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371) >> at >> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:348) >> at >> kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:366) >> at kafka.server.KafkaApis.handle(KafkaApis.scala:68) >> at >> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60) >> at java.lang.Thread.run(Thread.java:745) >> >> In both cases I was running kafka-console-consumer.sh to consume >> messages from the topic I was sending to, and the consumer did see the >> message I sent. >> >> Dave >> >> >> -----Original Message----- >> From: d...@dspeterson.com >> Sent: Monday, January 11, 2016 12:32am >> To: users@kafka.apache.org >> Subject: trouble sending produce requests to 0.9.0.0 broker cluster >> >> Hi, >> >> I'm having trouble sending produce requests to a Kafka 0.9.0.0 broker >> cluster consisting of 4 brokers with IDs 0, 1, 2, and 3. All 4 >> brokers are running locally on my CentOS 7 development box, listening >> on ports 9092, 9093, 9094, and 9095 respectively. I am running my >> own producer code (Bruce, see https://github.com/ifwe/bruce), which >> works without problems with Kafka 0.8, but has problems with 0.9.0.0. >> When I send a produce request consisting of a single message, I often >> get a response consisting of error ACK -1 (Unknown, unexpected server >> error) although I have also seen other errors such as 6 >> (NotLeaderForPartition). During one observed instance of this >> behavior I saw the following: >> >> sent produce request >> -------------------- >> - single message set containing a single message >> - topic: "topic_1" >> - partition: 3 >> - empty key >> - value: "hello world" >> - API key: 0 (produce request) >> - API version: 0 >> - correlation ID: 1 >> - empty client ID >> - required ACKs: 1 >> - timeout: 10000 >> - message magic byte: 0 >> - message attributes: 0 (no compression) >> - destination broker ID: 3 >> >> received produce response >> ------------------------- >> - correlation ID: 1 >> - topic: "topic_1" >> - partition: 3 >> - error code: -1 >> >> I have included tcpdump output for the above request and response >> below. The topic/partition layout is as follows: >> >> $ bin/kafka-topics.sh --describe --zookeeper >> localhost:2181/kafka-0.9.0.0 >> Topic:topic_1 PartitionCount:4 ReplicationFactor:2 >> Configs: >> Topic: topic_1 Partition: 0 Leader: 0 Replicas: 0,1 >> Isr: 0,1 >> Topic: topic_1 Partition: 1 Leader: 1 Replicas: 1,2 >> Isr: 2,1 >> Topic: topic_1 Partition: 2 Leader: 2 Replicas: 2,3 >> Isr: 2,3 >> Topic: topic_1 Partition: 3 Leader: 3 Replicas: 3,0 >> Isr: 0,3 >> $ >> >> Does anyone have any ideas about what may be causing this behavior? >> >> Additionally, I was using tcpdump to watch communication between >> kafka-console-producer.sh and the broker cluster, and noticed that >> the console producer sends a value of 1 for the API version in its >> produce requests. Is there any documentation describing protocol >> changes between API versions 0 and 1? I looked here: >> >> >> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol >> >> and didn't see anything about the new API version. Any help is much >> appreciated. >> >> Thanks, >> Dave >> >> >> Produce request... >> >> # tcpdump -i lo -X -s 0 -nn src port 51147 >> tcpdump: verbose output suppressed, use -v or -vv for full protocol >> decode >> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes >> 23:03:44.947589 IP 127.0.0.1.51147 > 127.0.0.1.9095: Flags [P.], seq >> 2040824198:2040824280, ack 1758705421, win 342, options [nop,nop,TS val >> 71321820 ecr 71222139], length 82 >> 0x0000: 4500 0086 4300 4000 4006 f96f 7f00 0001 E...C.@.@..o.... >> 0x0010: 7f00 0001 c7cb 2387 79a4 8186 68d3 b70d ......#.y...h... >> 0x0020: 8018 0156 fe7a 0000 0101 080a 0440 48dc ...V.z.......@H. >> 0x0030: 043e c37b 0000 004e 0000 0000 0000 0001 .>.{...N........ >> 0x0040: ffff 0001 0000 2710 0000 0001 0007 746f ......'.......to >> 0x0050: 7069 635f 3100 0000 0100 0000 0300 0000 pic_1........... >> 0x0060: 2500 0000 0000 0000 0000 0000 1973 acf7 %............s.. >> 0x0070: 7c00 00ff ffff ff00 0000 0b68 656c 6c6f |..........hello >> 0x0080: 2077 6f72 6c64 .world >> 23:03:44.956525 IP 127.0.0.1.51147 > 127.0.0.1.9095: Flags [.], ack >> 40, win 342, options [nop,nop,TS val 71321830 ecr 71321830], length 0 >> 0x0000: 4500 0034 4301 4000 4006 f9c0 7f00 0001 E..4C.@.@....... >> 0x0010: 7f00 0001 c7cb 2387 79a4 81d8 68d3 b734 ......#.y...h..4 >> 0x0020: 8010 0156 fe28 0000 0101 080a 0440 48e6 ...V.(.......@H. >> 0x0030: 0440 48e6 .@H. >> ^C >> 2 packets captured >> 4 packets received by filter >> 0 packets dropped by kernel >> # >> >> Produce response... >> >> # tcpdump -i lo -X -s 0 -nn dst port 51147 >> tcpdump: verbose output suppressed, use -v or -vv for full protocol >> decode >> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes >> 23:03:44.947661 IP 127.0.0.1.9095 > 127.0.0.1.51147: Flags [.], ack >> 2040824280, win 32742, options [nop,nop,TS val 71321821 ecr 71321820], >> length 0 >> 0x0000: 4500 0034 7476 4000 4006 c84b 7f00 0001 E..4tv@.@..K.... >> 0x0010: 7f00 0001 2387 c7cb 68d3 b70d 79a4 81d8 ....#...h...y... >> 0x0020: 8010 7fe6 fe28 0000 0101 080a 0440 48dd .....(.......@H. >> 0x0030: 0440 48dc .@H. >> 23:03:44.956499 IP 127.0.0.1.9095 > 127.0.0.1.51147: Flags [P.], seq >> 0:39, ack 1, win 32742, options [nop,nop,TS val 71321830 ecr 71321820], >> length 39 >> 0x0000: 4500 005b 7477 4000 4006 c823 7f00 0001 E..[tw@.@..#.... >> 0x0010: 7f00 0001 2387 c7cb 68d3 b70d 79a4 81d8 ....#...h...y... >> 0x0020: 8018 7fe6 fe4f 0000 0101 080a 0440 48e6 .....O.......@H. >> 0x0030: 0440 48dc 0000 0023 0000 0001 0000 0001 .@H....#........ >> 0x0040: 0007 746f 7069 635f 3100 0000 0100 0000 ..topic_1....... >> 0x0050: 03ff ffff ffff ffff ffff ff ........... >> ^C >> 2 packets captured >> 4 packets received by filter >> 0 packets dropped by kernel >> # >> >> >> >> >>