Re: KIP-41: KafkaConsumer Max Records

Jason Gustafson Mon, 04 Jan 2016 10:38:42 -0800

Gwen and Ismael,

I agree the configuration option is probably the way to go, but I was
wondering whether there would be cases where it made sense to let the
consumer dynamically set max messages to adjust for downstream slowness.
For example, if the consumer is writing consumed records to another
database, and that database is experiencing heavier than expected load,
then the consumer could halve its current max messages in order to adapt
without risking rebalance hell. It could then increase max messages as the
load on the database decreases. It's basically an easier way to handle flow
control than we provide with pause/resume.
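
A minimal sketch of that back-off, in plain Java and independent of the
consumer itself. The class AdaptiveMaxRecords and its methods are
hypothetical names, not part of any Kafka API, and growing the limit by a
fixed step on recovery is an assumption (the halving is the only part
described above):

```java
// Hypothetical helper, not part of the Kafka client: tracks how many
// records the application wants to handle per poll() and adapts that
// number to downstream health.
final class AdaptiveMaxRecords {
    private final int min;
    private final int max;
    private final int step;
    private int current;

    AdaptiveMaxRecords(int initial, int min, int max, int step) {
        if (min < 1 || max < min || step < 1) {
            throw new IllegalArgumentException("need 1 <= min <= max and step >= 1");
        }
        this.min = min;
        this.max = max;
        this.step = step;
        // Clamp the starting value into [min, max].
        this.current = Math.max(min, Math.min(max, initial));
    }

    int current() {
        return current;
    }

    // Downstream is struggling: halve the limit, but never go below min.
    int onSlowDownstream() {
        current = Math.max(min, current / 2);
        return current;
    }

    // Downstream has recovered: grow by a fixed step (assumed policy),
    // but never go above max. Long arithmetic avoids int overflow.
    int onHealthyDownstream() {
        current = (int) Math.min((long) max, (long) current + step);
        return current;
    }
}
```

The application would call onSlowDownstream() when a database write is slow
or times out and onHealthyDownstream() after a fast one. How current() would
then be handed to the consumer is exactly the open question here.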


-Jason

On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira <g...@confluent.io> wrote:

> The wiki you pointed to is no longer maintained and fell out of sync with
> the code and protocol.
>
> You may want to refer to:
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>
> On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>
> > Hi guys,
> >
> > I realized I never thanked y'all for your input - thanks!
> > Jason: I apologize for assuming your stance on the issue! Feels like we
> > all agreed on the solution. +1
> >
> > Follow-up: Jason made a point about defining prefetch and fairness
> > behaviour in the KIP. I am now working on putting that down in writing.
> > To be able to do this I think I need to understand the current prefetch
> > behaviour in the new consumer API (0.9) a bit better. Some specific
> > questions:
> >
> >    - How does a specific consumer balance incoming messages from multiple
> >    partitions? Is the consumer simply issuing Multi-Fetch requests[1] for
> >    the consumer's assigned partitions of the relevant topics? Or is the
> >    consumer fetching from one partition at a time and balancing between
> >    them internally? That is, is the responsibility of partition balancing
> >    (and fairness) on the broker side or consumer side?
> >    - Is the above documented somewhere?
> >
> > [1] https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> > (see "Multi-Fetch")
> >
> > Thanks,
> > Jens
> >
> > On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira <g...@confluent.io> wrote:
> > >
> > > > Given the background, it sounds like you'll generally want each call
> > > > to poll() to return the same number of events (which is the number you
> > > > planned on having enough memory / time for). It also sounds like tuning
> > > > the number of events will be closely tied to tuning the session timeout.
> > > > That is - if I choose to lower the session timeout for some reason, I
> > > > will have to modify the number of records returned too.
> > > >
> > > > If those assumptions are correct, I think a configuration makes more
> > > > sense.
> > > > 1. We are unlikely to want this parameter to be changed during the
> > > > lifetime of the consumer
> > > > 2. The correct value is tied to another configuration parameter, so
> > > > they will be controlled together.
> > > >
> > >
> > > I was thinking the same thing.
> > >
> > > Ismael
> > >
> >
> >
> >
> > --
> > Jens Rantil
> > Backend engineer
> > Tink AB
> >
> > Email: jens.ran...@tink.se
> > Phone: +46 708 84 18 32
> > Web: www.tink.se
> >
> > Facebook <https://www.facebook.com/#!/tink.se>
> > Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
> > Twitter <https://twitter.com/tink>
> >
>
