Found the problem - it is a bug with Partitions of kafka client. Can you guys confirm and patch in kafka clients?
for (int i = 0; i < numPartitions; i++) { int partition = Utils.abs(counter.getAndIncrement()) % numPartitions; if (partitions.get(partition).leader() != null) { return partitions.get(partition).partition(); } } On Fri, Feb 20, 2015 at 2:35 PM, Xiaoyu Wang <xw...@rocketfuel.com> wrote: > Update: > > I am using kafka.clients 0.8.2-beta. Below are the test steps > > 1. setup local kafka clusters with 2 brokers, 0 and 1 > 2. create topic X with replication fact 1 and 4 partitions > 3. verify that each broker has two partitions > 4. shutdown broker 1 > 5. start a producer sending data to topic X using KafkaProducer with > required ack = 1 > 6. producer hangs and does not exit. > > Offline partitions were take care of when the partitions is null (code > attached below). However, the timeout setting does not seem to work. Not > sure what caused KafkaProducer to hang. > > // choose the next available node in a round-robin fashion > for (int i = 0; i < numPartitions; i++) { > int partition = Utils.abs(counter.getAndIncrement()) % numPartitions; > if (partitions.get(partition).leader() != null) > return partition; > } > // no partitions are available, give a non-available partition > return Utils.abs(counter.getAndIncrement()) % numPartitions; > > > > > > On Fri, Feb 20, 2015 at 1:48 PM, Xiaoyu Wang <xw...@rocketfuel.com> wrote: > >> Hello, >> >> I am experimenting sending data to kafka using KafkaProducer and found >> that when a partition is completely offline, e.g. a topic with replication >> factor = 1 and some broker is down, KafkaProducer seems to be hanging >> forever. Not even exit with the timeout setting. Can you take a look? >> >> I checked code and found that the partitioner create partition based on >> the total partition number - including those offline partitions. Is it >> possible that we change ProducerClient to ignore offline partitions? >> >> >> Thanks, >> >> -Xiaoyu >> >> >