log.cleanup.policy is delete, not compact. The relevant broker settings are:

log.cleaner.enable=true
log.cleaner.threads=5
log.cleanup.policy=delete
log.flush.scheduler.interval.ms=3000
log.retention.minutes=1440
log.segment.bytes=1073741824   (1 GB)
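Note that a topic-level override can shadow the broker default, so it may be worth confirming the effective cleanup.policy on the affected topic. A quick check, assuming a 0.8.x-era broker where topics are described via ZooKeeper reachable at localhost:2181:

  bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic Topic22kv

The Configs: field in the output lists any per-topic overrides (e.g. cleanup.policy=compact); if it is empty, the broker-level log.cleanup.policy=delete applies.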
Messages are keyed but not compressed; the producer is async and uses the Kafka default partitioner.

// fragment from the producer code; producer, msg, rnd and getPartitionKey() are defined elsewhere
String message = msg.getString();
String uniqKey = "" + rnd.nextInt();   // random key
String partKey = getPartitionKey();    // partition key
KeyedMessage<String, String> data =
    new KeyedMessage<String, String>(this.topicName, uniqKey, partKey, message);
producer.send(data);

Thanks
Zakee

> On Mar 14, 2015, at 4:23 PM, gharatmayures...@gmail.com wrote:
>
> Is your topic log compacted? Also, if it is, are the messages keyed? Or are the messages compressed?
>
> Thanks,
>
> Mayuresh
>
> Sent from my iPhone
>
>> On Mar 14, 2015, at 2:02 PM, Zakee <kzak...@netzero.net> wrote:
>>
>> Thanks, Jiangjie, for helping resolve the Kafka controller-migration-driven partition leader rebalance issue. The logs are much cleaner now.
>>
>> There are a few incidents of out-of-range offsets even though there are no consumers running, only producers and replica fetchers. I was trying to relate it to a cause; it looks like compaction (log segment deletion) is causing this. Not sure whether this is expected behavior.
>>
>> Broker-4:
>> [2015-03-14 07:46:52,338] ERROR [Replica Manager on Broker 4]: Error when processing fetch request for partition [Topic22kv,5] offset 1754769769 from follower with correlation id 1645671. Possible cause: Request for offset 1754769769 but we only have log segments in the range 1400864851 to 1754769732. (kafka.server.ReplicaManager)
>>
>> Broker-3:
>> [2015-03-14 07:46:52,356] INFO The cleaning for partition [Topic22kv,5] is aborted and paused (kafka.log.LogCleaner)
>> [2015-03-14 07:46:52,408] INFO Scheduling log segment 1400864851 for log Topic22kv-5 for deletion. (kafka.log.Log)
>> …
>> [2015-03-14 07:46:52,421] INFO Compaction for partition [Topic22kv,5] is resumed (kafka.log.LogCleaner)
>> [2015-03-14 07:46:52,517] ERROR [ReplicaFetcherThread-2-4], Current offset 1754769769 for partition [Topic22kv,5] out of range; reset offset to 1400864851 (kafka.server.ReplicaFetcherThread)
>> [2015-03-14 07:46:52,517] WARN [ReplicaFetcherThread-2-4], Replica 3 for partition [Topic22kv,5] reset its fetch offset from 1400864851 to current leader 4's start offset 1400864851 (kafka.server.ReplicaFetcherThread)
>>
>> <topic22kv_746a_314_logs.txt>
>>
>> Thanks
>> Zakee
>>
>>> On Mar 9, 2015, at 12:18 PM, Zakee <kzak...@netzero.net> wrote:
>>>
>>> No broker restarts.
>>>
>>> Created a Kafka issue: https://issues.apache.org/jira/browse/KAFKA-2011
>>>
>>>>> Logs for rebalance:
>>>>> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
>>>>> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that completed preferred replica election: (kafka.controller.KafkaController)
>>>>> …
>>>>> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
>>>>> ...
>>>>> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
>>>>> ...
>>>>> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica leader election for partitions (kafka.controller.KafkaController)
>>>>> ...
>>>>> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
>>>>>
>>>>> Also, I still see lots of the below errors (~69k) in the logs since the restart. Is there any reason other than the rebalance for these errors?
>>>>>
>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-11,7] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-2,25] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-2,21] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-22,9] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>
>>>> Could you paste the related logs in controller.log?
>>> What specifically should I search for in the logs?
>>>
>>> Thanks,
>>> Zakee
>>>
>>>> On Mar 9, 2015, at 11:35 AM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>>>>
>>>> Is there anything wrong with the brokers around that time, e.g. a broker restart?
>>>> The logs you pasted are actually from the replica fetchers. Could you paste the related logs in controller.log?
>>>>
>>>> Thanks.
>>>>
>>>> Jiangjie (Becket) Qin
>>>>
>>>>> On 3/9/15, 10:32 AM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>
>>>>> Correction: Actually the rebalance kept happening until about 24 hours after the start, and that is where the errors below were found. Ideally the rebalance should not have happened at all.
>>>>>
>>>>> Thanks
>>>>> Zakee
>>>>>
>>>>>>> On Mar 9, 2015, at 10:28 AM, Zakee <kzak...@netzero.net> wrote:
>>>>>>>
>>>>>>> Hmm, that sounds like a bug. Can you paste the log of the leader rebalance here?
>>>>>> Thanks for your suggestions.
>>>>>> It looks like the rebalance actually happened only once, soon after I started with a clean cluster and data was pushed; it didn't happen again so far, and I see the partition leader counts on the brokers have not changed since then. One of the brokers was constantly showing 0 for partition leader count. Is that normal?
>>>>>>
>>>>>> Also, I still see lots of the below errors (~69k) in the logs since the restart. Is there any reason other than the rebalance for these errors?
>>>>>>
>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-11,7] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-2,25] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-2,21] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-22,9] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>>
>>>>>>> Some other things to check are:
>>>>>>> 1. The actual property name is auto.leader.rebalance.enable, not auto.leader.rebalance. You probably already know this, just to double-confirm.
>>>>>> Yes
>>>>>>
>>>>>>> 2. In the zookeeper path, can you verify /admin/preferred_replica_election does not exist?
>>>>>> ls /admin
>>>>>> [delete_topics]
>>>>>> ls /admin/preferred_replica_election
>>>>>> Node does not exist: /admin/preferred_replica_election
>>>>>>
>>>>>> Thanks
>>>>>> Zakee
>>>>>>
>>>>>>> On Mar 7, 2015, at 10:49 PM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>>>>>>>
>>>>>>> Hmm, that sounds like a bug. Can you paste the log of the leader rebalance here?
>>>>>>> Some other things to check are:
>>>>>>> 1. The actual property name is auto.leader.rebalance.enable, not auto.leader.rebalance. You probably already know this, just to double-confirm.
>>>>>>> 2. In the zookeeper path, can you verify /admin/preferred_replica_election does not exist?
>>>>>>>
>>>>>>> Jiangjie (Becket) Qin
>>>>>>>
>>>>>>>> On 3/7/15, 10:24 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>>
>>>>>>>> I started with a clean cluster and started to push data. It still does the rebalance at random intervals even though auto.leader.rebalance.enable is set to false.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zakee
>>>>>>>>
>>>>>>>>> On Mar 6, 2015, at 3:51 PM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>>>>>>>>>
>>>>>>>>> Yes, the rebalance should not happen in that case. That is a little bit strange. Could you try to launch a clean Kafka cluster with auto.leader.rebalance.enable disabled and try pushing data?
>>>>>>>>> When leader migration occurs, a NotLeaderForPartition exception is expected.
>>>>>>>>>
>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>
>>>>>>>>>> On 3/6/15, 3:14 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>>>>
>>>>>>>>>> Yes, Jiangjie, I do see lots of these "Starting preferred replica leader election for partitions" messages in the logs. I also see a lot of Produce request failure warnings with the NotLeader exception.
>>>>>>>>>>
>>>>>>>>>> I tried setting auto.leader.rebalance.enable to false. I am still noticing the rebalance happening. My understanding was that the rebalance will not happen when this is set to false.
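For reference, the broker-side settings involved here, as a minimal server.properties sketch (the two interval/percentage values are the stock defaults, shown only for illustration; they are ignored once the feature is off):

  # disable automatic preferred-replica (leader) rebalancing
  auto.leader.rebalance.enable=false
  # consulted only when the feature is enabled
  leader.imbalance.check.interval.seconds=300
  leader.imbalance.per.broker.percentage=10

With the feature disabled, a preferred replica election should only happen when triggered explicitly, e.g. via the kafka-preferred-replica-election.sh tool or by creation of the /admin/preferred_replica_election znode, which is why checking that path in ZooKeeper (as above) is a useful sanity check.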
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Zakee
>>>>>>>>>>
>>>>>>>>>>> On Feb 25, 2015, at 5:17 PM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I don't think num.replica.fetchers will help in this case. Increasing the number of fetcher threads only helps when you have a large amount of data coming into a broker and more replica fetcher threads help keep up. We usually use only 1-2 per broker. But in your case, it looks like leader migration is causing the issue.
>>>>>>>>>>> Do you see anything else in the log? Like a preferred leader election?
>>>>>>>>>>>
>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>
>>>>>>>>>>> On 2/25/15, 5:02 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks, Jiangjie.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, I do see under-replicated partitions shooting up, usually every hour. Anything I could try to reduce it?
>>>>>>>>>>>>
>>>>>>>>>>>> How does "num.replica.fetchers" affect the replica sync? It is currently configured as 7 on each of the 5 brokers.
>>>>>>>>>>>>
>>>>>>>>>>>> -Zakee
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> These messages are usually caused by leader migration. I think as long as you don't see this lasting forever along with a bunch of under-replicated partitions, it should be fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2/25/15, 4:07 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Need to know whether I should be worried about these or ignore them.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I see tons of these exceptions/warnings in the broker logs, not sure what causes them and what could be done to fix them.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>>>>>>>>>> [2015-02-25 11:01:41,785] ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>>>>>>>>>> [2015-02-25 11:01:41,785] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 950084 from client ReplicaFetcherThread-1-2 on partition [TestTopic,2] failed due to Leader not local for partition [TestTopic,2] on broker 2 (kafka.server.ReplicaManager)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Zakee
>>>
>>> Thanks
>>> Zakee