Dan,

Just to make sure I understand correctly: what do you mean by a different ip -> broker mapping? Do you mean you changed your brokers' IPs? The consumer and the producer use different mechanisms to get cluster information. The consumer gets all of its information from ZooKeeper, while the producer has to talk to the brokers directly. We have found some problems in the producer in current trunk and are fixing them. I just want to see whether your scenario indicates a new issue that we haven't addressed.
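
To make the producer side concrete, here is a toy Python model of the behaviour under discussion. It is only a sketch under stated assumptions, not the Kafka client code: `ToyCluster`, `ToyProducer`, and the `refresh_on_not_leader` flag are hypothetical names made up for illustration, and it models only a partition leader moving to another broker id, not the changed ip -> broker id mapping itself. It shows why a producer that keeps retrying against cached metadata can stay stuck, while one that refreshes its metadata after a NOT_LEADER_FOR_PARTITION error recovers on the next attempt.

```python
NOT_LEADER_FOR_PARTITION = "NOT_LEADER_FOR_PARTITION"
NONE = "NONE"


class ToyCluster:
    """Hypothetical stand-in for the brokers: holds the true leader per partition."""

    def __init__(self, leaders):
        self.leaders = dict(leaders)  # partition name -> broker id

    def produce(self, broker_id, partition):
        # A broker that is not the current leader rejects the produce request.
        if self.leaders.get(partition) != broker_id:
            return NOT_LEADER_FOR_PARTITION
        return NONE

    def fetch_metadata(self):
        # The producer learns about leaders by asking the brokers directly.
        return dict(self.leaders)


class ToyProducer:
    """Hypothetical producer that caches partition leaders."""

    def __init__(self, cluster, refresh_on_not_leader=True):
        self.cluster = cluster
        self.refresh_on_not_leader = refresh_on_not_leader
        self.cached_leaders = cluster.fetch_metadata()

    def send(self, partition, retries=5):
        """Return the number of attempts used, or None if all retries failed."""
        for attempt in range(1, retries + 1):
            leader = self.cached_leaders.get(partition)
            error = self.cluster.produce(leader, partition)
            if error == NONE:
                return attempt
            if error == NOT_LEADER_FOR_PARTITION and self.refresh_on_not_leader:
                # Drop the stale view and ask the cluster again before retrying.
                self.cached_leaders = self.cluster.fetch_metadata()
        return None


cluster = ToyCluster({"samza-metrics-0": 1})
refreshing = ToyProducer(cluster, refresh_on_not_leader=True)
stale = ToyProducer(cluster, refresh_on_not_leader=False)

# Leadership moves after both producers have cached their metadata.
cluster.leaders["samza-metrics-0"] = 2

print(refreshing.send("samza-metrics-0"))  # prints 2: one failure, one refresh, success
print(stale.send("samza-metrics-0"))       # prints None: every retry hits the old leader
```

The `stale` producer corresponds to the symptom in the log below: the retry counter keeps going down, but every attempt goes to the same broker and fails with the same error.
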

Thanks.

Jiangjie (Becket) Qin

On 5/8/15, 9:48 AM, "Mayuresh Gharat" <gharatmayures...@gmail.com> wrote:

>It should do an updateMetadataRequest in case it gets
>NOT_LEADER_FOR_PARTITION. This looks like a bug.
>
>Thanks,
>
>Mayuresh
>
>On Fri, May 8, 2015 at 8:53 AM, Dan <danharve...@gmail.com> wrote:
>
>> Hi,
>>
>> We've noticed an issue on our staging environment where all 3 of our
>> Kafka hosts shut down and came back with a different ip -> broker id
>> mapping. I know this is not good and we're fixing that separately. But
>> what we noticed is that all the consumers recovered but the producers
>> got stuck with the following exceptions:
>>
>> WARN 2015-05-08 09:19:56,347 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544968 on topic-partition samza-metrics-0, retrying (2145750068 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,448 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544970 on topic-partition samza-metrics-0, retrying (2145750067 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,549 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544972 on topic-partition samza-metrics-0, retrying (2145750066 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,649 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544974 on topic-partition samza-metrics-0, retrying (2145750065 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,749 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544976 on topic-partition samza-metrics-0, retrying (2145750064 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,850 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544978 on topic-partition samza-metrics-0, retrying (2145750063 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:56,949 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544980 on topic-partition samza-metrics-0, retrying (2145750062 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:57,049 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544982 on topic-partition samza-metrics-0, retrying (2145750061 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:57,150 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544984 on topic-partition samza-metrics-0, retrying (2145750060 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:57,254 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544986 on topic-partition samza-metrics-0, retrying (2145750059 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:57,351 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544988 on topic-partition samza-metrics-0, retrying (2145750058 attempts left). Error: NOT_LEADER_FOR_PARTITION
>> WARN 2015-05-08 09:19:57,454 org.apache.kafka.clients.producer.internals.Sender: Got error produce response with correlation id 3544990 on topic-partition samza-metrics-0, retrying (2145750057 attempts left). Error: NOT_LEADER_FOR_PARTITION
>>
>> So it appears as if the producer did not refresh the metadata once the
>> brokers had come back up. The exceptions carried on for a few hours
>> until we restarted the producers.
>>
>> We noticed this both in the 0.8.2.1 Java clients and via kafka-rest
>> (https://github.com/confluentinc/kafka-rest), which is using 0.8.2.0-cp.
>>
>> Is this a known issue when all brokers go away, or is it a subtle bug
>> we've hit?
>>
>> Thanks,
>> Dan
>>
>
>--
>-Regards,
>Mayuresh R. Gharat
>(862) 250-7125