How to set partition.assignment.strategy at run time?

2018-01-24 Thread Jun MA
Hi all, I have a custom PartitionAssignor class that needs a parameter to be constructed. So instead of specifying partition.assignment.strategy with the class name in the consumer properties, how could I do it at runtime? I’m using the Kafka 0.9 Java client. Thanks, Jun
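
(Sketch added for context, not from the thread.) The 0.9 client appears to instantiate whatever class is named under partition.assignment.strategy with a no-arg constructor and, if that class also implements Configurable, to hand it the full consumer config map, so the constructor parameter can usually be passed as an extra consumer property instead. A minimal sketch, assuming the internal AbstractPartitionAssignor helper that ships with the 0.9 client; the property name my.assignor.param is made up for the example and worth verifying against the exact client version in use:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.internals.AbstractPartitionAssignor;
import org.apache.kafka.common.Configurable;
import org.apache.kafka.common.TopicPartition;

public class MyAssignor extends AbstractPartitionAssignor implements Configurable {

    private String param;   // the value you would otherwise pass to a constructor

    // Called reflectively by the consumer after the no-arg constructor, with the consumer config.
    @Override
    public void configure(Map<String, ?> configs) {
        this.param = (String) configs.get("my.assignor.param");   // hypothetical property name
    }

    @Override
    public String name() {
        return "my-assignor";
    }

    @Override
    public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                    Map<String, Subscription> subscriptions) {
        // Placeholder logic: every member gets an empty assignment; real logic would use 'param'.
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        for (String memberId : subscriptions.keySet())
            assignment.put(memberId, new ArrayList<TopicPartition>());
        return assignment;
    }
}

On the application side the extra property then rides along with the normal consumer config:

    props.put("partition.assignment.strategy", MyAssignor.class.getName());
    props.put("my.assignor.param", "whatever-you-need");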

Log compaction failed because offset map doesn't have enough space

2017-05-16 Thread Jun Ma
Hi team, We are having an issue with compacting the __consumer_offsets topic in our cluster. We’re seeing lines in log-cleaner.log saying: [2017-05-16 11:56:28,993] INFO Cleaner 0: Building offset map for log __consumer_offsets-15 for 349 segments in offset range [0, 619265471). (kafka.log.LogCleaner)
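
(Context added here, not part of the thread.) The cleaner indexes message keys into a fixed-size dedupe buffer while building the offset map, and the "not enough space" failure generally points at that buffer being too small for the number of unique keys in the segments being cleaned. Settings commonly revisited in server.properties, with illustrative values only:

    # Total memory for the cleaner's offset map, shared across all cleaner threads (default 128 MB).
    log.cleaner.dedupe.buffer.size=536870912
    # Fraction of the dedupe buffer the cleaner may fill before it stops adding keys (default 0.9).
    log.cleaner.io.buffer.load.factor=0.9
    # Each cleaner thread gets an equal share of the dedupe buffer.
    log.cleaner.threads=1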

Re: Does Kafka producer waits till previous batch returns responce before sending next one?

2017-04-30 Thread Jun MA
Does this mean that if the client has retries > 0 and max.in.flight.requests.per.connection > 1, then even if the topic has only one partition, there’s still no guarantee of ordering? Thanks, Jun > On Apr 30, 2017, at 7:57 AM, Hans Jespersen wrote: > > There is a parameter that controls th
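
(Illustration added here, not part of the quoted reply.) With retries enabled, a failed batch can be resent after a newer batch has already succeeded, so per-partition ordering is only preserved when at most one request is in flight. A hedged sketch of the producer settings being discussed, assuming the Java producer and a placeholder broker address:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("retries", "3");                                   // retry transient failures
        props.put("max.in.flight.requests.per.connection", "1");     // keep ordering when retries happen
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<String, String>("my-topic", "key", "value"));
        }
    }
}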

Re: Elegant way to failover controller

2017-04-05 Thread Jun MA
This will cause the current controller > to resign and a new one to be elected, but it’s random as to what broker > that will be. > > A better question here is why do you want to move the controller? > > -Todd > > On Wed, Apr 5, 2017 at 9:09 AM, Jun MA wrote: > >
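
(Added for context, not from the thread.) The step the quoted reply is presumably describing is deleting the /controller znode in ZooKeeper: the current controller resigns and whichever broker re-creates the znode first becomes the new controller, so the outcome is effectively random. A rough sketch with the ZooKeeper Java client, assuming a placeholder ensemble address:

import org.apache.zookeeper.ZooKeeper;

public class ForceControllerReelection {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connect string; use the cluster's real ensemble.
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> { });
        try {
            // /controller is an ephemeral znode owned by the current controller.
            // Deleting it forces a new election; there is no way to pick the winner this way.
            zk.delete("/controller", -1);   // version -1 = delete regardless of znode version
        } finally {
            zk.close();
        }
    }
}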

Elegant way to failover controller

2017-04-05 Thread Jun MA
Hi, We are running Kafka 0.9.0.1 and I’m looking for an elegant way to fail over the controller to another broker. Right now I have to restart the broker to move the controller elsewhere. Is there a way to fail over the controller to a specific broker? Is there a way to fail it over without restarting the broke

Re: ISR churn

2017-03-22 Thread Jun MA
Hi David, I checked our cluster, and the producer purgatory size is mostly under 3. But I don’t quite understand this metric, could you please explain it a little bit? Thanks, Jun > On Mar 22, 2017, at 3:07 PM, David Garcia wrote: > > producer purgatory size

Re: ISR churn

2017-03-22 Thread Jun MA
’s no target release for fixing: > > https://issues.apache.org/jira/browse/KAFKA-4674 > > Jun Ma, what exactly did you do to failover the controller to a new broker? > If that works for you, I'd like to try

Recommended number of partitions on each broker

2017-03-01 Thread Jun MA
Hi, I’m curious what the recommended number of partitions running on each individual broker is. We have a 3-node cluster running 0.9.0.1; each node has 24 cores, 1.1 TB of SSD, 48 GB of RAM, and a 10 Gb NIC. There are about 300 partitions running on each broker and the resource usage is pretty low (5% CPU, 50% RAM

Re: Frequently shrink/expand ISR on one broker

2017-02-23 Thread Jun MA
Feb 19, 2017, at 2:13 PM, Jun MA wrote: > > Hi team, > > We are running Confluent 0.9.0.1 on a cluster with 6 brokers. These days one > of our brokers (broker 1) frequently shrinks the ISR and expands it again immediately, > roughly every 20 minutes, and I couldn’t find out why. Based on t

Re: Question about messages in __consumer_offsets topic

2017-02-22 Thread Jun MA
, Jun > On Feb 22, 2017, at 4:37 PM, Todd Palino wrote: > > __consumer_offsets is a log-compacted topic, and a NULL body indicates a > delete tombstone. So it means to delete the entry that matches the key > (group, topic, partition tuple). > > -Todd > > > > On
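
(Illustration added here, not from the thread.) A tombstone is simply a record whose value is null for a given key; on a compacted topic the log cleaner eventually drops both the tombstone and the older records for that key. A minimal sketch against a hypothetical compacted topic:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value marks the key for deletion once the log cleaner runs on the compacted topic.
            producer.send(new ProducerRecord<String, String>("my-compacted-topic", "some-key", null));
        }
    }
}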

Question about messages in __consumer_offsets topic

2017-02-22 Thread Jun MA
Hi guys, I’m trying to consume from the __consumer_offsets topic to get the exact committed offset of each consumer. Normally I can see messages like: [eds-els-recopp-jenkins-01-5651,eds-incre-staging-1,0]::[OffsetMetadata[29791925,NO_METADATA],CommitTime 1487090167367,ExpirationTime 1487176567367],

Re: JMX metrics for replica lag time

2017-02-21 Thread Jun MA
+),partition=([0-9]+) > lag > should be proportional to the maximum batch size of a produce request. > > On Mon, Feb 20, 2017 at 5:43 PM, Jun Ma wrote: > >> Hi Guozhang, >> >> Thanks for your reply. Could you tell me which one indicates the lag >> between

Re: JMX metrics for replica lag time

2017-02-20 Thread Jun Ma
0.10.x they are still the same as stated in: > > https://kafka.apache.org/documentation/#monitoring > > The mechanism for determining which followers have been dropped out of the ISR > has changed, but the metrics have not. > > > Guozhang > > > On Sun, Feb 19, 2017 at 7

JMX metrics for replica lag time

2017-02-19 Thread Jun MA
Hi, I’m looking for the JMX metric that represents replica lag time in 0.9.0.1. Based on the documentation, I can only find kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica, which is the max lag in messages between follower and leader replicas. But since in 0.9.0.1 lag in messages is
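
(Not from the thread.) For completeness, the MaxLag gauge named above can be read over JMX; a rough sketch, assuming the broker exposes a JMX port (9999 here is a placeholder) and that the gauge's attribute is named Value as with Kafka's other Yammer gauges:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReplicaMaxLagCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; assumes the broker was started with a JMX port exposed (e.g. JMX_PORT=9999).
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName maxLag = new ObjectName(
                    "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica");
            // Gauge exposed by the broker: maximum lag in messages across this broker's follower fetchers.
            System.out.println("MaxLag = " + mbsc.getAttribute(maxLag, "Value"));
        } finally {
            connector.close();
        }
    }
}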

Frequently shrink/expand ISR on one broker

2017-02-19 Thread Jun MA
Hi team, We are running Confluent 0.9.0.1 on a cluster with 6 brokers. These days one of our brokers (broker 1) frequently shrinks the ISR and expands it again immediately, roughly every 20 minutes, and I couldn’t find out why. Based on the log, it can kick out any of the other brokers, not just a specific one. H

Re: Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-21 Thread Jun MA
Hi Peter, We’ve seen this happen under normal operation in our virtualized environment as well. Our network is not very stable and blips happen pretty frequently. Your explanation sounds reasonable to me, and I’m very interested in your further thoughts on this. In our case, we’re using quorum-based r

Re: Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-20 Thread Jun MA
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Generation+rather+than+High+Watermark+for+Truncation>. > If you can reproduce it in a more general scenario we would be very > interested. > > All the best > B

Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-18 Thread Jun MA
Would be grateful to hear opinions from experts out there. Thanks in advance. > On Dec 17, 2016, at 11:06 AM, Jun MA wrote: > > Hi, > > We saw the following FATAL error on 2 of our brokers (3 in total, the active > controller doesn’t have this) and they crashed at the same ti

Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-16 Thread Jun MA
Hi, We saw the following FATAL error on 2 of our brokers (3 in total, the active controller doesn’t have this) and they crashed at the same time. [2016-12-16 16:12:47,085] FATAL [ReplicaFetcherThread-0-3], Halting because log truncation is not allowed for topic __consumer_offsets, Current lead

Re: Consumer group keep bouncing between generation id 0 and 1

2016-11-19 Thread Jun Ma
uery all the consumer groups and > filter by their subscribed topics. > > > Guozhang > > > > > On Fri, Nov 18, 2016 at 4:19 PM, Jun MA wrote: > > > Hi guys, > > > > I’m using kafka 0.9.0.1 and Java client. I saw the following exceptions > > throw by m

Consumer group keep bouncing between generation id 0 and 1

2016-11-18 Thread Jun MA
Hi guys, I’m using Kafka 0.9.0.1 and the Java client. I saw the following exceptions thrown by my consumer: Caused by: java.lang.IllegalStateException: Correlation id for response (767587) does not match request (767585) at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:

Lost __consumer_offsets and _schemas topic level config in zookeeper

2016-08-17 Thread Jun MA
Hi, We are using Confluent 0.9.0.1 and we’ve noticed recently that we lost the topic-level config for __consumer_offsets and _schemas (used for the schema registry). We checked config/topics/ in ZooKeeper and found that there are no entries for the __consumer_offsets and _schemas topics. Our server-level config uses cleanup po

How to manually bring offline partition back?

2016-07-14 Thread Jun MA
Hi all, We have some partitions that only have 1 replica, and after the broker failed, the partitions on that broker became unavailable. We set unclean.leader.election.enable=false, so the controller doesn’t bring those partitions back online even after that broker is up. We tried Preferred Re

Re: kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
commit any offset for offsets.retention.minutes, Kafka will clean up its offsets, which makes sense for my case. Thanks, Jun > On May 3, 2016, at 11:46 AM, Jun MA wrote: > > Thanks for your reply. I checked the offset topic and the cleanup policy is > actually compact. > >
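
(Added note, not from the thread.) The expiry window described above is a broker-side setting; a group that commits nothing for longer than this has its offsets removed. Illustrative server.properties lines, with the values shown being the usual defaults rather than recommendations:

    # How long committed offsets are retained for an inactive group (default 1440 minutes = 24 hours).
    offsets.retention.minutes=1440
    # How often the broker scans for and drops expired offsets.
    offsets.retention.check.interval.ms=600000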

Re: kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
ijs wrote: > > Looks like it, you need to be sure the offset topic is using compaction, > and the broker is set to enable compaction. > > On Tue, May 3, 2016 at 9:56 AM Jun MA wrote: > >> Hi, >> >> I’m using 0.9.0.1 new-consumer api. I noticed that after kafka

kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
Hi, I’m using the 0.9.0.1 new-consumer API. I noticed that after Kafka cleans up all old log segments (they reach the delete.retention time), I get an unknown offset. bin/kafka-consumer-groups.sh --bootstrap-server server:9092 --new-consumer --group testGroup --describe GROUP, TOPIC, PARTITION, CURRENT OFFSET,