Re: No Heartbeat request on commit

Jason Gustafson Thu, 28 Apr 2016 09:30:05 -0700

Ah, yeah. That's probably caused by the new topic metadata version, which
isn't supported on 0.9 brokers. To test on trunk, you'd have to upgrade the
brokers as well. Either that or you can rewind to before KAFKA-3306 (which
was just committed the day before yesterday)?


-Jason
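
For readers puzzled by the "Error reading string of length 17995, only 145 bytes available" message quoted below, here is a minimal sketch of how a reader and writer that disagree on a message layout can produce that kind of error. It does not use Kafka's actual wire format or its Schema class; the two-field layout, the extra leading field, and all names (SchemaMismatchSketch, encodeOld, decodeNew) are assumptions made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: this is NOT Kafka's protocol or parser.
final class SchemaMismatchSketch {

    // Writes a string as a 2-byte length followed by its UTF-8 bytes.
    static void writeString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putShort((short) bytes.length);
        buf.put(bytes);
    }

    // Reads a length-prefixed string and fails if the length cannot be satisfied.
    static String readString(ByteBuffer buf) {
        int length = buf.getShort();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalStateException("Error reading string of length " + length
                    + ", only " + buf.remaining() + " bytes available");
        }
        byte[] bytes = new byte[length];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // "Old" layout (assumed): host string, then port.
    static ByteBuffer encodeOld(String host, int port) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeString(buf, host);
        buf.putInt(port);
        buf.flip();
        return buf;
    }

    // Reader that matches the old layout.
    static String decodeOld(ByteBuffer buf) {
        String host = readString(buf);
        buf.getInt(); // port
        return host;
    }

    // "New" reader (assumed): expects an extra 4-byte field before the host,
    // so it consumes part of the host string and then misreads the length.
    static String decodeNew(ByteBuffer buf) {
        buf.getInt(); // field that the old writer never wrote
        return readString(buf);
    }

    public static void main(String[] args) {
        ByteBuffer old = encodeOld("broker1", 9092);
        System.out.println(decodeOld(old.duplicate()));
        try {
            decodeNew(old.duplicate());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point is only that once the two sides disagree on the layout, ordinary payload bytes get interpreted as a length prefix, which yields an implausibly large string length.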

On Thu, Apr 28, 2016 at 9:01 AM, vinay sharma <[email protected]>
wrote:

> Hi Jason,
>
> I built kafka-client and tried using it, but my producers and consumers
> started throwing the exception below. Is 0.10 not going to be compatible
> with brokers on version 0.9.0.1? Or do I need to make some config changes
> to producers / consumers to make them compatible with brokers on the old
> version? Or do I need to upgrade the brokers to the new version as well?
>
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'brokers': Error reading field 'host': Error reading string of length 17995, only 145 bytes available
>     at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:75)
>     at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
>     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
>
> Regards,
> Vinay Sharma
>
> On Thu, Apr 28, 2016 at 12:32 AM, Jason Gustafson <[email protected]>
> wrote:
>
> > Hey Vinay,
> >
> > Any chance you can run the same test against trunk? I'm guessing this might
> > be caused by a bug in the 0.9 consumer which basically causes some requests
> > to fail when a bunch of them are sent to the broker at the same time.
> >
> > -Jason
> >
> > On Wed, Apr 27, 2016 at 1:02 PM, vinay sharma <[email protected]>
> > wrote:
> >
> > > Hi Jason,
> > >
> > > This makes sense. We use 0.9.0.1 and we do have the session timeout set a
> > > bit high, but nothing can guarantee that processing will never take longer
> > > than the session timeout. I am trying to test a proactive commit approach
> > > to handle such cases, when processing takes an unusually long time. To keep
> > > the consumer's session alive during long processing, I proactively
> > > commitSync processed records every 15 seconds. The session timeout I kept
> > > is 30000.
> > >
> > > *Problem:-*
> > > With a heartbeat interval of 3000, I expect a heartbeat request to be sent
> > > on each proactive commit, which happens every 15 seconds. In my tests I see
> > > that this does not always happen. I see a time window of more than 30
> > > seconds in which no heartbeat is sent even though there were commits during
> > > it. After this window I see a couple of successful heartbeat responses
> > > until the end of the poll, but as soon as I poll again and call commitSync
> > > in the next poll I get an "ILLEGAL_GENERATION" error. This error always
> > > happens just after a metadata refresh, or in the next poll's processing
> > > after a metadata refresh. I am attaching logs where I kept the metadata
> > > refresh interval at 40000, 90000, and 500000.
> > >
> > > *Test results *:-
> > > Test with meta refresh 40000 ms ran around 70 seconds from 1st poll.
> > > Test with meta refresh 90000 ms ran around 120 seconds from 1st poll.
> > > Test with meta refresh 500000 ms ran around 564 seconds from 1st poll.
> > >
> > > Every test follows the same pattern: the generation is marked dead some
> > > time after a metadata refresh. A metadata refresh before the 1st poll does
> > > not create any issue, but the ones after a poll and during long processing
> > > do.
> > >
> > > *Environment:-*
> > > My setup has 3 brokers and 1 ZooKeeper node. The topic has 3 partitions and
> > > a replication factor of 3. Messages are already published to the topic.
> > >
> > > *Logic used in test cases :- *
> > > On each poll I initialize a map with the current committed offset position
> > > of the partitions being consumed. I update this map after processing each
> > > record and use it to proactively commit every 15 seconds. The map is
> > > initialized again after a proactive commit.
> > >
> > > I am not sure what is wrong here, but I do not see any issue in the code or
> > > in the offset commits. Log files and a class with a main method are
> > > attached for your reference.
> > >
> > > Regards,
> > > Vinay Sharma
> > >
> > >
> > >
> > > On Wed, Apr 27, 2016 at 2:46 PM, Jason Gustafson <[email protected]>
> > > wrote:
> > >
> > >> Hi Vinay,
> > >>
> > >> Answers below:
> > >>
> > >> > 1) Is it correct to say that each commitSync will trigger a
> > >> > HeartBeatTask? If no heartbeat has been sent within the specified
> > >> > heartbeat interval, should I see a successful heartbeat response or a
> > >> > failure message in the logs near the commitSync success log?
> > >>
> > >>
> > >> Not quite. Heartbeats are sent periodically according to the
> > >> heartbeat.interval.ms configuration. However, since the consumer has no
> > >> background thread, they can only be sent in API calls such as poll() or
> > >> commitSync(). So calling commitSync() may or may not result in a heartbeat
> > >> depending only on whether one is "due."
> > >>
> > >> > 2) Is it correct to say that a metadata refresh will not act as a
> > >> > heartbeat, will not trigger a HeartBeatTask, and will not reset the
> > >> > HeartBeatTask?
> > >>
> > >>
> > >> That is correct. Metadata refreshes are not related to heartbeats.
> > >>
> > >> > 3) Where is a consumer session maintained? Let's say my consumer is
> > >> > listening to 3 partitions on a 3-broker cluster where each broker is the
> > >> > leader of 1 partition. Will each of the brokers have a session for my
> > >> > consumer, or is there just 1 session maintained somewhere in common,
> > >> > like ZooKeeper?
> > >>
> > >>
> > >> One of the brokers serves as the "group coordinator." When the consumer
> > >> starts up, it sends a GroupCoordinator request to one of the brokers to
> > >> find out who the coordinator is. Currently, coordinators are chosen from
> > >> among the leaders of the partitions of the __consumer_offsets topic. This
> > >> lets us take advantage of the leader election process to also handle
> > >> coordinator failures. The coordinator of each group maintains state for
> > >> the group and keeps track of session timeouts.
> > >>
> > >> > 4) In the above setup, if during long processing I commit a record
> > >> > through commitSync, which triggers a heartbeat request, and a successful
> > >> > response is received for it, what does this response mean? Does it mean
> > >> > that my session with each broker is renewed? Or does it mean that just
> > >> > the leader for the partition of the committed record knows that my
> > >> > consumer is alive, and the consumer's session on the other brokers will
> > >> > still time out?
> > >>
> > >>
> > >> The coordinator is the only broker that is aware of a consumer's session
> > >> and all offset commits are sent to it. Successful heartbeats mean that the
> > >> session is still active. Heartbeats are also used to let the consumer
> > >> discover when a rebalance has begun. If a new member joins the group, then
> > >> the coordinator returns an error code in the heartbeat responses of the
> > >> active members to let them know that they need to rejoin the group so that
> > >> partitions can be rebalanced.
> > >>
> > >> I wouldn't get too hung up on commit/heartbeat behavior. The crux of the
> > >> issue is that you need to call poll() often enough to avoid getting timed
> > >> out by the coordinator. If you find this happening frequently, you
> > >> probably need to increase session.timeout.ms. There's not really any
> > >> downside to doing so other than that hard failures (in which the consumer
> > >> can't be shutdown cleanly) will take a little longer to detect. Normal
> > >> shutdown doesn't have this problem. It can be difficult in 0.9 to ensure
> > >> that poll() is called often enough since you don't have direct control
> > >> over the amount of data returned in poll(), but we're adding an option
> > >> (max.poll.records) in 0.10 which hopefully can be set conservatively
> > >> enough to make this problem go away.
> > >>
> > >> -Jason
> > >>
> > >> On Wed, Apr 27, 2016 at 7:11 AM, vinay sharma <[email protected]>
> > >> wrote:
> > >>
> > >> > Hey,
> > >> >
> > >> > I am working on a simplified test case to check if there is any issue in
> > >> > my code. Just to make sure that none of my assumptions are wrong, it
> > >> > would be great if you could help me find answers to the following
> > >> > queries:-
> > >> >
> > >> > 1) Is it correct to say that each commitSync will trigger a
> > >> > HeartBeatTask? If no heartbeat has been sent within the specified
> > >> > heartbeat interval, should I see a successful heartbeat response or a
> > >> > failure message in the logs near the commitSync success log?
> > >> > 2) Is it correct to say that a metadata refresh will not act as a
> > >> > heartbeat, will not trigger a HeartBeatTask, and will not reset the
> > >> > HeartBeatTask?
> > >> > 3) Where is a consumer session maintained? Let's say my consumer is
> > >> > listening to 3 partitions on a 3-broker cluster where each broker is the
> > >> > leader of 1 partition. Will each of the brokers have a session for my
> > >> > consumer, or is there just 1 session maintained somewhere in common,
> > >> > like ZooKeeper?
> > >> > 4) In the above setup, if during long processing I commit a record
> > >> > through commitSync, which triggers a heartbeat request, and a successful
> > >> > response is received for it, what does this response mean? Does it mean
> > >> > that my session with each broker is renewed? Or does it mean that just
> > >> > the leader for the partition of the committed record knows that my
> > >> > consumer is alive, and the consumer's session on the other brokers will
> > >> > still time out?
> > >> >
> > >> > Regards,
> > >> > Vinay Sharma
> > >> >
> > >> > On Tue, Apr 26, 2016 at 2:38 PM, Jason Gustafson <[email protected]>
> > >> > wrote:
> > >> >
> > >> > > Hey Vinay,
> > >> > >
> > >> > > Are you saying that heartbeats are not sent while a metadata refresh is
> > >> > > in progress? Do you have any logs which show us the apparent problem?
> > >> > >
> > >> > > Thanks,
> > >> > > Jason
> > >> > >
> > >> > > On Tue, Apr 26, 2016 at 8:18 AM, vinay sharma <[email protected]>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Ismael,
> > >> > > >
> > >> > > > Treating commitSync as a heartbeat will definitely resolve the issue I
> > >> > > > am facing, but the reason behind my issue does not seem to be the one
> > >> > > > mentioned in the defect (i.e. frequent commitSync requests).
> > >> > > >
> > >> > > > I am sending commitSync periodically only to keep my session alive
> > >> > > > when my consumer is still processing records and is close to the
> > >> > > > session timeout (tried the 10th / 12th / 15th / 20th second after poll
> > >> > > > was called, where the session timeout is 30). I see a heartbeat
> > >> > > > response received in the logs along with each commitSync call, but
> > >> > > > this stops after a metadata refresh request is issued. I see in the
> > >> > > > logs that the commit succeeds, but there is no heartbeat response
> > >> > > > received message in the logs after the metadata refresh until the
> > >> > > > next poll.
> > >> > > >
> > >> > > > Regards,
> > >> > > > Vinay Sharma
> > >> > > >
> > >> > > > On Mon, Apr 25, 2016 at 5:06 PM, Ismael Juma <[email protected]> wrote:
> > >> > > >
> > >> > > > > Hi Vinay,
> > >> > > > >
> > >> > > > > This was fixed via https://issues.apache.org/jira/browse/KAFKA-3470
> > >> > > > > (will be part of 0.10.0.0).
> > >> > > > >
> > >> > > > > Ismael
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Apr 25, 2016 at 1:52 PM, vinay sharma <[email protected]>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hello,
> > >> > > > > >
> > >> > > > > > I am using client API 0.9.0.1 and facing an issue. As per my logs,
> > >> > > > > > it seems that on each commitSync(offsets) a heartbeat request is
> > >> > > > > > sent, but after a metadata refresh request, until the next poll(),
> > >> > > > > > commits do not send any heartbeat request.
> > >> > > > > >
> > >> > > > > > The KafkaConsumers I create sometimes get a session timeout due to
> > >> > > > > > no heartbeat, especially during longer processing times. I call
> > >> > > > > > commitSync(offsets) at regular intervals to keep the session alive
> > >> > > > > > when processing takes longer than usual. Everything works fine if
> > >> > > > > > the commit intervals are very small or if I commit after each
> > >> > > > > > record, but if I commit, let's say, every 12 seconds with a
> > >> > > > > > 30-second session timeout, then I can see the consumer getting
> > >> > > > > > timed out sometimes.
> > >> > > > > >
> > >> > > > > > Any help or pointers will be much appreciated. Thanks in advance.
> > >> > > > > >
> > >> > > > > > Regards,
> > >> > > > > > Vinay sharma
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
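
Jason's point in the thread, that a heartbeat goes out during an API call only when one is "due", can be modeled with a few lines of plain Java. This is a simplified sketch and not the consumer's actual implementation: the class name HeartbeatDueSketch, the onApiCall method, and the simulated clock are all assumptions for illustration. The 3000 ms interval and the 15-second commit spacing are the values mentioned in the thread.

```java
// Hypothetical model of "heartbeat only when due"; not Kafka client code.
final class HeartbeatDueSketch {
    private final long heartbeatIntervalMs;
    private long lastHeartbeatMs;

    HeartbeatDueSketch(long heartbeatIntervalMs, long startMs) {
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.lastHeartbeatMs = startMs;
    }

    // Stands in for any blocking client call (poll, commitSync) in this model.
    // Returns true if a heartbeat was due and is therefore "sent" now.
    boolean onApiCall(long nowMs) {
        if (nowMs - lastHeartbeatMs >= heartbeatIntervalMs) {
            lastHeartbeatMs = nowMs;
            return true;
        }
        return false;
    }

    // Counts heartbeats when the application makes one API call every
    // callEveryMs milliseconds, up to and including durationMs.
    static int countHeartbeats(long intervalMs, long callEveryMs, long durationMs) {
        HeartbeatDueSketch hb = new HeartbeatDueSketch(intervalMs, 0L);
        int sent = 0;
        for (long now = callEveryMs; now <= durationMs; now += callEveryMs) {
            if (hb.onApiCall(now)) {
                sent++;
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        // Calls every second: only some of them carry a heartbeat.
        System.out.println(countHeartbeats(3000, 1000, 30000));
        // Calls every 15 seconds: each call finds a heartbeat overdue.
        System.out.println(countHeartbeats(3000, 15000, 30000));
    }
}
```

In this model a commit every 15 seconds always finds a heartbeat overdue, which matches Vinay's expectation. The model says nothing about why heartbeats went missing after a metadata refresh in his logs.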

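The settings named in the thread can be collected into a consumer configuration roughly as follows. This is a sketch using only java.util.Properties. The property keys session.timeout.ms, heartbeat.interval.ms, and max.poll.records are the ones discussed above (the last one only from 0.10 on, per Jason); the class name and the concrete values 60000 and 100 are hypothetical placeholders, not recommendations from the thread.

```java
import java.util.Properties;

// Hypothetical helper; the values passed in main are placeholders.
final class ConsumerTimeoutConfigSketch {

    static Properties build(int sessionTimeoutMs, int heartbeatIntervalMs, int maxPollRecords) {
        // The heartbeat interval has to be lower than the session timeout,
        // otherwise the session could expire between two heartbeats.
        if (heartbeatIntervalMs >= sessionTimeoutMs) {
            throw new IllegalArgumentException(
                    "heartbeat.interval.ms must be lower than session.timeout.ms");
        }
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.setProperty("heartbeat.interval.ms", Integer.toString(heartbeatIntervalMs));
        // Limits how many records one poll() returns (0.10 and later, per the thread).
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(60000, 3000, 100);
        props.list(System.out);
    }
}
```

A larger session timeout only delays detection of hard failures, as noted in the thread, so the trade-off is between tolerance for long processing and failure-detection latency.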