Hi Sampath,

Maybe you need to check the logs to see if you can find any clue; it is hard to say why this happened.

Best,
Lisheng
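For the log check suggested above, the interesting loggers are the Kafka consumer client's own. A minimal log4j.properties sketch, assuming the microservice routes its Kafka client logging through log4j; the appender and pattern are illustrative:

```
# Console appender; any existing appender works just as well
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

# Raise only the consumer packages to DEBUG so fetch, heartbeat and
# coordinator activity for the stuck client becomes visible
log4j.logger.org.apache.kafka.clients.consumer=DEBUG
log4j.logger.org.apache.kafka.clients.consumer.internals=DEBUG
```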
On Thu, Aug 22, 2019 at 2:24 AM sampath kumar <sampath...@gmail.com> wrote:

> Hi Lisheng,
>
> I guess the issue is not with message.max.bytes; the same messages are
> consumed after just running a rebalance.
>
> Regards,
> Sampath
>
> On Wed, Aug 21, 2019 at 7:57 PM Lisheng Wang <wanglishen...@gmail.com> wrote:
>
> > Hi Sampath
> >
> > The description of fetch.max.bytes is the following, from
> > https://kafka.apache.org/documentation/#consumerconfigs
> >
> > The maximum amount of data the server should return for a fetch request.
> > Records are fetched in batches by the consumer, and if the first record
> > batch in the first non-empty partition of the fetch is larger than this
> > value, the record batch will still be returned to ensure that the consumer
> > can make progress. As such, this is not an absolute maximum. The maximum
> > record batch size accepted by the broker is defined via message.max.bytes
> > (broker config) or max.message.bytes (topic config). Note that the
> > consumer performs multiple fetches in parallel.
> >
> > The point is "if the first record batch in the first non-empty partition
> > of the fetch is larger than this value, the record batch will still be
> > returned to ensure that the consumer can make progress".
> >
> > I just want to make sure the issue you are facing is not related to that.
> >
> > Best,
> > Lisheng
> >
> > On Wed, Aug 21, 2019 at 8:36 PM sampath kumar <sampath...@gmail.com> wrote:
> >
> > > Lisheng,
> > >
> > > The issue is not with fetch.max.bytes, as the same messages start
> > > processing after restarting the consumer.
> > >
> > > Regards,
> > > Sampath
> > >
> > > On Wed, Aug 21, 2019 at 4:30 PM Lisheng Wang <wanglishen...@gmail.com> wrote:
> > >
> > > > Hi Sampath
> > > >
> > > > Can you confirm that "fetch.max.bytes" on the consumer is not smaller
> > > > than "message.max.bytes" on the broker?
> > > >
> > > > You may also need to check the consumer log to see if you can find any
> > > > clue once you enable it. If no error/exception is found on the consumer
> > > > side, you may need to change the log level to "debug" to get more
> > > > detailed information.
> > > >
> > > > Best,
> > > > Lisheng
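The check Lisheng asks for can be expressed directly against the consumer configuration. A minimal Java sketch, not taken from the thread; the bootstrap address and the assumed broker-side message.max.bytes value are illustrative:

```
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerFetchConfig {
    public static KafkaConsumer<String, String> build() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // illustrative address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "agent.group.inv");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Keep the consumer's fetch ceilings at least as large as the broker's
        // message.max.bytes (assumed to be about 1 MB here) so one large batch
        // cannot stall a partition.
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 52428800);          // default, 50 MB
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576); // >= broker message.max.bytes

        return new KafkaConsumer<>(props);
    }
}
```

The broker's actual message.max.bytes should be read from the broker configuration rather than assumed.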
> > > > On Wed, Aug 21, 2019 at 6:30 PM sampath kumar <sampath...@gmail.com> wrote:
> > > >
> > > > > Hi Lisheng,
> > > > >
> > > > > Thanks for the response.
> > > > >
> > > > > Right now we have INFO enabled on the broker; however, logging is not
> > > > > enabled for the consumer client. We will enable it.
> > > > >
> > > > > Yes, when we manually stop and start the consumer on the affected
> > > > > microservice instance, a rebalance triggers and consuming resumes.
> > > > >
> > > > > On the broker side the consumer client status is healthy. We verified
> > > > > it both in Kafka Manager and with the consumer group command in the
> > > > > broker CLI, so I guess heartbeat is not the issue. The issue does not
> > > > > affect the complete consumer group, only some consumer clients in a
> > > > > couple of microservices.
> > > > >
> > > > > For example: in one of the services we have 38 consumer
> > > > > clients/threads registered for the consumer group, and only 1 client
> > > > > is not receiving messages; the rest are all getting them.
> > > > >
> > > > > Anything else you want me to check here?
> > > > >
> > > > > Regards,
> > > > > Sampath
> > > > >
> > > > > On Wed, Aug 21, 2019 at 3:21 PM Lisheng Wang <wanglishen...@gmail.com> wrote:
> > > > >
> > > > > > May I know what log level you configured on the consumer and the
> > > > > > broker? You say it resumes when a rebalance happens, so the consumer
> > > > > > is alive; can you see any heartbeat information in the consumer log?
> > > > > >
> > > > > > Best,
> > > > > > Lisheng
> > > > > >
> > > > > > On Wed, Aug 21, 2019 at 5:23 PM sampath kumar <sampath...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > We are using broker 5.3.0 with the new consumers (consumers managed
> > > > > > > by brokers). The brokers are deployed in a Kubernetes environment.
> > > > > > >
> > > > > > > Number of brokers: 3, with a 3-node ZooKeeper setup.
> > > > > > >
> > > > > > > One of the topics, "inventory.request", has a replication factor of
> > > > > > > 3, in-sync replicas configured as 2, and a partition count of 1024.
> > > > > > >
> > > > > > > We have 20 instances of a microservice subscribed to the above
> > > > > > > topic; each instance has 48 consumers registered under the group
> > > > > > > "agent.group.inv".
> > > > > > >
> > > > > > > Issue:
> > > > > > >
> > > > > > > Sometimes a couple of the consumers suddenly stop receiving
> > > > > > > requests, and the lag keeps increasing. The only option to recover
> > > > > > > is to restart the consumers, which invokes rebalancing.
> > > > > > >
> > > > > > > ```
> > > > > > > agent.group.inv inventory.request 543 17423 17612 189 agent19.inv.35-6e6eb252-8d26-489b-8d7f-53b25f182f30 /10.200.187.103 agent19.inv.35
> > > > > > > ```
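The lag line just above is the per-partition view from the consumer groups tool. A sketch of the command that produces it, assuming the Confluent 5.3.0 wrapper script (the Apache distribution ships it as kafka-consumer-groups.sh) and an illustrative bootstrap address:

```
# Output columns: GROUP, TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET,
# LAG, CONSUMER-ID, HOST, CLIENT-ID
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group agent.group.inv
```

A growing LAG with an unchanged CURRENT-OFFSET for a member that still appears in this output matches the behavior described: the member is alive and assigned, but its position is not advancing.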
> > > > > > > We checked a thread dump of the consumer: the consumer keeps
> > > > > > > polling and is assigned partitions, but it does not receive any
> > > > > > > messages.
> > > > > > >
> > > > > > > ```
> > > > > > > "inventory.request-agent19.inv.35" #499 prio=1 os_prio=4 tid=0x00007f88a855b000 nid=0x389 runnable [0x00007f87e8be6000]
> > > > > > >    java.lang.Thread.State: RUNNABLE
> > > > > > >         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> > > > > > >         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> > > > > > >         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> > > > > > >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> > > > > > >         - locked <0x00000000aa502730> (a sun.nio.ch.Util$3)
> > > > > > >         - locked <0x00000000aa5026b0> (a java.util.Collections$UnmodifiableSet)
> > > > > > >         - locked <0x00000000aa502668> (a sun.nio.ch.EPollSelectorImpl)
> > > > > > >         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> > > > > > >         at org.apache.kafka.common.network.Selector.select(Selector.java:794)
> > > > > > >         at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
> > > > > > >         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:539)
> > > > > > >         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
> > > > > > >         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
> > > > > > >         at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1281)
> > > > > > >         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225)
> > > > > > >         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1159)
> > > > > > > ```
> > > > > > >
> > > > > > > No errors are observed on the consumer client or the brokers, and
> > > > > > > no resource issue is seen.
> > > > > > >
> > > > > > > Can you please help us identify the root cause of this consumer
> > > > > > > client behavior?
> > > > > > >
> > > > > > > Please let me know if any other details are required.
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > > Sampath
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Sampath
> > >
> > > --
> > > Regards,
> > > Sampath
>
> --
> Regards,
> Sampath
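One way to narrow down a member that keeps polling but never receives records, as described above, is to log its assignment and its position against the broker's log-end offsets from inside the poll loop. A minimal Java sketch; the class, method, and variable names are illustrative and not from the thread:

```
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignmentProbe {
    // Call from the consumer's own polling thread (KafkaConsumer is not
    // thread-safe), for example after every N consecutive empty polls.
    static void logProgress(KafkaConsumer<String, String> consumer) {
        Set<TopicPartition> assignment = consumer.assignment();
        Map<TopicPartition, Long> endOffsets = consumer.endOffsets(assignment);
        for (TopicPartition tp : assignment) {
            long position = consumer.position(tp);            // next offset this member will fetch
            long end = endOffsets.getOrDefault(tp, position);  // log-end offset reported by the broker
            System.out.printf("%s position=%d end=%d lag=%d%n",
                    tp, position, end, Math.max(0, end - position));
        }
    }

    static void pollLoop(KafkaConsumer<String, String> consumer) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            if (records.isEmpty()) {
                logProgress(consumer); // a stuck member keeps showing a growing lag here
            }
            // ... hand records to the processing logic ...
        }
    }
}
```

If the position stands still while the end offset grows for a partition such as inventory.request-543, the fetch path of that member is worth a DEBUG-level look; if the partition is missing from the assignment altogether, the problem lies on the rebalance side instead.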