Re: Kafka Consumer thoughts

Jay Kreps Fri, 31 Jul 2015 12:21:13 -0700

I like all these ideas.

Our convention is to keep method names declarative so it should probably be
  subscribe(List<String> topics, Callback c)
  assign(List<TopicPartition)


The javadoc would obviously have to clarify the relationship between a
subscribed topic and assigned partitions. Presumably unsubscribe/unassign
are unnecessary since this is just a matter of subscribing to the empty
list.

-Jay

On Fri, Jul 31, 2015 at 11:29 AM, Jason Gustafson <ja...@confluent.io>
wrote:

> I was thinking a little bit this morning about the subscription API and I
> have a few ideas on how to address some of the concerns about intuitiveness
> and exception handling.
>
> 1. Split the current notion of topic/partition subscription into
> subscription of topics and assignment of partitions. These concepts are
> pretty fundamentally different and I think at least some of the confusion
> about when subscriptions() can be used is caused by the fact that we
> overload the term. If instead that method is renamed to assignment(), then
> we are communicating to users that it is possible to have a subscription
> without an active assignment, which is not obvious with the current API.
> The code in fact already separates these concepts internally, so this would
> just expose it to the user.
>
> 2. Merge rebalance callback into a subscription callback and add method a
> way to handle errors. The consumer's current rebalance callback is
> basically invoked when a subscription "succeeds," so it seems a little
> weird to also provide a callback on subscription. Perhaps we can just take
> the rebalance callback out of configuration and have the user provide it on
> subscribe(). We can add a method to the callback to handle errors (e.g. for
> non-existing topics). Since the callback is provided at subscribe time, it
> should be clearer to the user that the assignment will not be ready
> immediately when subscribe returns. It's also arguably a little more
> natural to set this callback at subscription time rather than when the
> consumer is constructed.
>
> 3. Get rid of the additive subscribe methods and just use setSubscription
> which would clear the old subscription. After you start providing callbacks
> to subscribe, then the implementation starts to get tricky if each call to
> subscribe provides a separate callback. Instead, as Jay suggested, we could
> just provide a way to set the full list of subscriptions at once, and then
> there is only one callback to maintain.
>
> With these points, the API might look something like this:
>
> void setSubscription(List<String> topics, RebalanceCallback callback);
> void setAssignment(List<TopicPartition> partitions);
> List<String> subscription();
> List<TopicPartition> assignment();
>
> interface RebalanceCallback {
>   void onAssignment(List<TopicPartition> partitions);
>   void onRevocation(List<TopicPartition> partitions);
>
>   // handle non-existing topics, etc.
>   void onError(Exception e);
> }
>
> Any thoughts?
>
> -Jason
>
>
>
> On Thu, Jul 30, 2015 at 11:59 AM, Jay Kreps <j...@confluent.io> wrote:
>
>> Hey Becket,
>>
>> Yeah the high-level belief here is that it is possible to give something
>> as high level as the existing "high level" consumer, but this is not likely
>> to be the end-all be-all of high-level interfaces for processing streams of
>> messages. For example neither of these interfaces handles the threading
>> model for the processing, which obviously is a fairly low-level
>> implementation detail left to the user in you proposal, the current code,
>> as well as the existing scala consumer.
>>
>> There will be many of these: the full-fledged stream processing
>> frameworks like Storm/Spark, scalaz streams, the RxJava stuff, a more
>> traditional message queue like "processor" interface, not to mention the
>> stuff we're trying to do with KIP-28. For these frameworks it will be quite
>> weird to add a bunch of new threads since they will want to dictate the
>> threading model.
>>
>> What will be a major failure though is if this client isn't low-level
>> enough and we need to introduce another layer underneath. This would happen
>> either because we dictate too much to make it usable for various
>> applications, frameworks, or use cases. This is the concern with dictating
>> threading and processing models.
>>
>> So to summarize the goal is to subsume the existing APIs, which I think
>> we all agree this does, and be a foundation on which to build other
>> abstractions.
>>
>> WRT KIP-28, I think it is quite general and if we do that right it will
>> subsume a lot of the higher level processing and will give a full threaded
>> processing model to the user.
>>
>>
>> -Jay
>>
>>
>> On Wed, Jul 29, 2015 at 6:25 PM, Jiangjie Qin <j...@linkedin.com> wrote:
>>
>>> Thanks for the comments Jason and Jay.
>>>
>>> Jason, I had the same concern for producer's callback as well before,
>>> but it seems to be fine from some callbacks I wrote - user can always pass
>>> in object in the constructor if necessary for synchronization.
>>>
>>> Jay, I agree that the current API might be fine for people who wants to
>>> wrap it up. But I thought the new consumer was supposed to be a combination
>>> of old high and low level consumer, which means it should be able to be
>>> used as is, just like producer. If KafkaConsumer is designed to be wrapped
>>> up for use, then the question becomes whether Kafka will provide a decent
>>> wrapper or not? Neha mentioned that KIP-28 will address the users who only
>>> care about data. Would that be the wrapper provided by Kafka? I am not sure
>>> if that is sufficient though because the processor is highly abstracted,
>>> and might only meet the static data stream requirement as I listed in the
>>> grid. For users who need something from the other grids, are we going to
>>> have another wrapper? Or are we expecting all the user to write their own
>>> wrapper for KafkaConsumer? Some other comments are in line.
>>>
>>> Thanks,
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On Wed, Jul 29, 2015 at 3:16 PM, Jay Kreps <j...@confluent.io> wrote:
>>>
>>>> Some comments on the proposal:
>>>>
>>>> I think we are conflating a number of things that should probably be
>>>> addressed individually because they are unrelated. My past experience is
>>>> that this always makes progress hard. The more we can pick apart these
>>>> items the better:
>>>>
>>>>    1. threading model
>>>>    2. blocking vs non-blocking semantics
>>>>    3. missing apis
>>>>    4. missing javadoc and other api surprises
>>>>    5. Throwing exceptions.
>>>>
>>>> The missing APIs are getting added independently. Some like your
>>>> proposed offsetByTime where things we agreed to hold off on for the first
>>>> release and do when we'd thought it through. If there are uses for it now
>>>> we can accelerate. I think each of these is really independent, we know
>>>> there are things that need to be added but lumping them all into one
>>>> discussion will be confusing.
>>>>
>>>> WRT throwing exceptions the policy is to throw exceptions that are
>>>> unrecoverable and handle and log other exceptions that are transient. That
>>>> policy makes sense if you go through the thought exercise of "what will the
>>>> user do if i throw this exception to them" if they have no other rational
>>>> response but to retry (and if failing to anticipate and retry with that
>>>> exception will kill their program) . You can argue whether the topic not
>>>> existing is transient or not, unfortunately the way we did auto-creation
>>>> makes it transient if you are in "auto create mode" and non-transient
>>>> otherwise (ick!). In any case this is an orthogonal discussion to
>>>> everything else. I think the policy is right and if we don't conform to it
>>>> in some way that is really an independent bug/discussion.
>>>>
>>> Agreed we can discuss about them separately.
>>>
>>>>
>>>> I suggest we focus on threading and the current event-loop style of api
>>>> design since I think that is really the crux.
>>>>
>>>> The analogy between the producer threading model and the consumer model
>>>> actually doesn't work for me. The goal of the producer is actually to take
>>>> requests from many many user threads and shove them into a single buffer
>>>> for batching. So the threading model isn't the 1:1 threads you describe it
>>>> is N:1.The goal of the consumer is to support single-threaded processing.
>>>> This is what drives the difference. Saying that the producer has N:1
>>>> threads therefore for the consumer should have 1:1 threads instead of just
>>>> 1 thread doesn't make sense any more then an analogy to the brokers
>>>> threading model would--the problem we're solving is totally different.
>>>>
>>> I think the ultimate goal for producer and consumer are still allowing
>>> user to send/receive data in parallel. In producer we picked the solution
>>> of one-producer-serving-multiple-threads, and in consumer we picked
>>> multiple-single-threaded-consumers instead of
>>> single-consumer-serving-multiple threads. And we believe people can always
>>> implement the latter with the former. I think this is a reasonable
>>> decision. However, there are also reasonable concerns over the
>>> multiple-single-threaded-consumers solution which is that the single-thread
>>> might have to be a dedicate polling thread in many cases which pushes user
>>> towards the other solution - i.e. implementing a
>>> single-thread-consumer-serving-multiple-threads wrapper. From what we hear,
>>> it seems to be a quite common concern for most of the users we talked to.
>>> Plus the adoption bar of the consumer will be much higher because user will
>>> have to understand some of the details of the things they don't care as
>>> listed in the grid.
>>> The analogy between producer/consumer is intended to show that a
>>> separate polling thread will solve the concerns we have.
>>>
>>>>
>>>>
>>> I think ultimately though what you need to think about is, does an event
>>>> loop style of API make sense? That is the source of all the issues you
>>>> describe. This style of API is incredibly prevalent from unix select to
>>>> GUIs to node.js. It's a great way to model multiple channels of messages
>>>> coming in. It is a fantastic style for event processing. Programmers
>>>> understand this style of api though I would agree it is unusual compared to
>>>> blocking apis. But it is is a single threaded processing model. The current
>>>> approach is basically a pure event loop with some convenience methods that
>>>> are effectively "poll until X is complete".
>>>>
>>>> I think basically all the confusion you are describing comes from not
>>>> documenting/expecting an event loop. The "if you don't call poll nothing
>>>> happens" point is basically this. It's an event loop. You have to loop. You
>>>> can't not call poll. The docs don't cover this right now, perhaps. I think
>>>> if they do it's not unreasonable behavior.
>>>>
>>> I'm not sure if I understand the event-loop correctly and honestly I did
>>> not think about it clearly before. My understanding is that an even-loop
>>> model means a single listener thread, but there can be multiple event
>>> generator threads. The downside is that the listener thread has to be fast
>>> and very careful about blocking. If we look at the consumer, the current
>>> model is the caller thread itself act as both event generator and listener.
>>> As a generator, it generates different task by calling the convenience
>>> methods. As a listener, it listens to the messages on broker and also the
>>> tasks generated by itself. So in our proposal, we are not changing the
>>> event-loop model here just separated the event generator and event
>>> listener. It looks to me that the underlying execution thread follows the
>>> event-loop model, the special thing might be it is not only listening to
>>> the messages from broker, but also listening to the tasks from the user
>>> thread. This is essentially the thing a consumer has to do - interact with
>>> both server and user.
>>>
>>>>
>>>> If we want to move away from an event loop I'm not sure *any* aspect
>>>> of the current event loop style of api makes sense any more. I am not
>>>> totally married to event loops, but i do think what we have gives an
>>>> elegant way of implementing any higher level abstractions that would fully
>>>> implement the user's parallelism model. I don't want to go rethink
>>>> everything but I do think a half-way implementation that is event loop +
>>>> background threads is likely going to be icky.
>>>>
>>> We brought this up before to change the consumer.poll() to
>>> consumer.consume(). And did not do so simply because we wanted to less
>>> change in API... I might be crazy but can we think of the proposed model as
>>> processing thread + event-loop instead, rather than event-loop + background
>>> thread?
>>>
>>>>
>>>> WRT making it configurable whether liveness means "actually consuming"
>>>> or "background thread running" I would suggest that that is really the
>>>> worst outcome. These type of "modes" that are functionally totally
>>>> different are just awful from a documentation, testing, usability, etc pov.
>>>> I would strongly prefer we pick either of these, document it, and make it
>>>> work well rather than trying to do both.
>>>>
>>> Previously I thought this was the major benefit we wanted from a single
>>> threaded model, personally I don't have a strong preference on this. So I
>>> am OK with either way.
>>>
>>>>
>>>> -Jay
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jul 29, 2015 at 1:20 PM, Neha Narkhede <n...@confluent.io>
>>>> wrote:
>>>>
>>>>> Works now. Thanks Becket!
>>>>>
>>>>> On Wed, Jul 29, 2015 at 1:19 PM, Jiangjie Qin <j...@linkedin.com>
>>>>> wrote:
>>>>>
>>>>>> Ah... My bad, forgot to change the URL link for pictures.
>>>>>> Thanks for the quick response, Neha. It should be fixed now, can you
>>>>>> try again?
>>>>>>
>>>>>> Jiangjie (Becket) Qin
>>>>>>
>>>>>> On Wed, Jul 29, 2015 at 1:10 PM, Neha Narkhede <n...@confluent.io>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Becket. Quick comment - there seem to be a bunch of images
>>>>>>> that the wiki refers to, but none loaded for me. Just making sure if its
>>>>>>> just me or can everyone not see the pictures?
>>>>>>>
>>>>>>> On Wed, Jul 29, 2015 at 12:00 PM, Jiangjie Qin <j...@linkedin.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I agree with Ewen that a single threaded model will be tricky to
>>>>>>>> implement the same conventional semantic of async or Future. We just
>>>>>>>> drafted the following wiki which explains our thoughts in LinkedIn on 
>>>>>>>> the
>>>>>>>> new consumer API and threading model.
>>>>>>>>
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/New+consumer+API+change+proposal
>>>>>>>>
>>>>>>>> We were trying to see:
>>>>>>>> 1. If we can use some kind of methodology to help us think about
>>>>>>>> what API we want to provide to user for different use cases.
>>>>>>>> 2. What is the pros and cons of current single threaded model. Is
>>>>>>>> there a way that we can maintain the benefits while solve the issues 
>>>>>>>> we are
>>>>>>>> facing now with single threaded model.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>
>>>>>>>> On Tue, Jul 28, 2015 at 10:28 PM, Ewen Cheslack-Postava <
>>>>>>>> e...@confluent.io> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 28, 2015 at 5:18 PM, Guozhang Wang <wangg...@gmail.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> I think Ewen has proposed these APIs for using callbacks along
>>>>>>>>>> with returning future in the commit calls, i.e. something similar to:
>>>>>>>>>>
>>>>>>>>>> public Future<void> commit(ConsumerCommitCallback callback);
>>>>>>>>>>
>>>>>>>>>> public Future<void> commit(Map<TopicPartition, Long> offsets,
>>>>>>>>>> ConsumerCommitCallback callback);
>>>>>>>>>>
>>>>>>>>>> At that time I was slightly intending not to include the Future
>>>>>>>>>> besides adding the callback mainly because of the implementation 
>>>>>>>>>> complexity
>>>>>>>>>> I feel it could introduce along with the retry settings after looking
>>>>>>>>>> through the code base. I would happy to change my mind if we could 
>>>>>>>>>> propose
>>>>>>>>>> a prototype implementation that is simple enough.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> One of the reasons that interface ended up being difficult (or
>>>>>>>>> maybe impossible) to make work reasonably is because the consumer was
>>>>>>>>> thread-safe at the time. That made it impossible to know what should 
>>>>>>>>> be
>>>>>>>>> done when Future.get() is called -- should the implementation call 
>>>>>>>>> poll()
>>>>>>>>> itself, or would the fact that the user is calling get() imply that 
>>>>>>>>> there's
>>>>>>>>> a background thread running the poll() loop and we just need to wait 
>>>>>>>>> for it?
>>>>>>>>>
>>>>>>>>> The consumer is no longer thread safe, but I think the same
>>>>>>>>> problem remains because the expectation with Futures is that they are
>>>>>>>>> thread safe. Which means that even if the consumer isn't thread safe, 
>>>>>>>>> I
>>>>>>>>> would expect to be able to hand that Future off to some other thread, 
>>>>>>>>> have
>>>>>>>>> the second thread call get(), and then continue driving the poll loop 
>>>>>>>>> in my
>>>>>>>>> thread (which in turn would eventually resolve the Future).
>>>>>>>>>
>>>>>>>>> I quite dislike the sync/async enum. While both operations commit
>>>>>>>>> offsets, their semantics are so different that overloading a single 
>>>>>>>>> method
>>>>>>>>> with both is messy. That said, I don't think we should consider this 
>>>>>>>>> an
>>>>>>>>> inconsistency wrt the new producer API's use of Future because the 
>>>>>>>>> two APIs
>>>>>>>>> have a much more fundamental difference that justifies it: they have
>>>>>>>>> completely different threading and execution models.
>>>>>>>>>
>>>>>>>>> I think a Future-based API only makes sense if you can guarantee
>>>>>>>>> the operations that Futures are waiting on will continue to make 
>>>>>>>>> progress
>>>>>>>>> regardless of what the thread using the Future does. The producer API 
>>>>>>>>> makes
>>>>>>>>> that work by processing asynchronous requests in a background thread. 
>>>>>>>>> The
>>>>>>>>> new consumer does not, and so it becomes difficult/impossible to 
>>>>>>>>> implement
>>>>>>>>> the Future correctly. (Or, you have to make assumptions which break 
>>>>>>>>> other
>>>>>>>>> use cases; if you want to support the simple use case of just making a
>>>>>>>>> commit() synchronous by calling get(), the Future has to call poll()
>>>>>>>>> internally; but if you do that, then if any user ever wants to add
>>>>>>>>> synchronization to the consumer via some external mechanism, then the
>>>>>>>>> implementation of the Future's get() method will not be subject to 
>>>>>>>>> that
>>>>>>>>> synchronization and things will break).
>>>>>>>>>
>>>>>>>>> -Ewen
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 28, 2015 at 4:03 PM, Neha Narkhede <n...@confluent.io
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey Adi,
>>>>>>>>>>>
>>>>>>>>>>> When we designed the initial version, the producer API was still
>>>>>>>>>>> changing. I thought about adding the Future and then just didn't 
>>>>>>>>>>> get to it.
>>>>>>>>>>> I agree that we should look into adding it for consistency.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 28, 2015 at 1:51 PM, Aditya Auradkar <
>>>>>>>>>>> aaurad...@linkedin.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Great discussion everyone!
>>>>>>>>>>>>
>>>>>>>>>>>> One general comment on the sync/async API's on the new consumer.
>>>>>>>>>>>>  I think the producer tackles sync vs async API's well. For
>>>>>>>>>>>> API's that can either be sync or async, can we simply return a 
>>>>>>>>>>>> future? That
>>>>>>>>>>>> seems more elegant for the API's that make sense either in both 
>>>>>>>>>>>> flavors.
>>>>>>>>>>>> From the users perspective, it is more consistent with the new 
>>>>>>>>>>>> producer.
>>>>>>>>>>>> One easy example is the commit call with the CommitType enum.. we 
>>>>>>>>>>>> can make
>>>>>>>>>>>> that call always async and users can block on the future if they 
>>>>>>>>>>>> want to
>>>>>>>>>>>> make sure their offsets are committed.
>>>>>>>>>>>>
>>>>>>>>>>>> Aditya
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 27, 2015 at 2:06 PM, Onur Karaman <
>>>>>>>>>>>> okara...@linkedin.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the great responses, everyone!
>>>>>>>>>>>>>
>>>>>>>>>>>>> To expand a tiny bit on my initial post: while I did bring up
>>>>>>>>>>>>> old high level consumers, the teams we spoke to were actually not 
>>>>>>>>>>>>> the types
>>>>>>>>>>>>> of services that simply wanted an easy way to get 
>>>>>>>>>>>>> ConsumerRecords. We spoke
>>>>>>>>>>>>> to infrastructure teams that I would consider to be closer to the
>>>>>>>>>>>>> "power-user" end of the spectrum and would want KafkaConsumer's 
>>>>>>>>>>>>> level of
>>>>>>>>>>>>> granularity. Some would use auto group management. Some would use 
>>>>>>>>>>>>> explicit
>>>>>>>>>>>>> group management. All of them would turn off auto offset commits. 
>>>>>>>>>>>>> Yes, the
>>>>>>>>>>>>> Samza team had prior experience with the old SimpleConsumer, but 
>>>>>>>>>>>>> this is
>>>>>>>>>>>>> the first kafka consumer being used by the Databus team. So I 
>>>>>>>>>>>>> don't really
>>>>>>>>>>>>> think the feedback received was about the simpler times or wanting
>>>>>>>>>>>>> additional higher-level clients.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Onur
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 27, 2015 at 1:41 PM, Jason Gustafson <
>>>>>>>>>>>>> ja...@confluent.io> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think if we recommend a longer session timeout, then we
>>>>>>>>>>>>>> should expose the heartbeat frequency in configuration since 
>>>>>>>>>>>>>> this generally
>>>>>>>>>>>>>> controls how long normal rebalances will take. I think it's 
>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>> hard-coded to 3 heartbeats per session timeout. It could also be 
>>>>>>>>>>>>>> nice to
>>>>>>>>>>>>>> have an explicit LeaveGroup request to implement clean shutdown 
>>>>>>>>>>>>>> of a
>>>>>>>>>>>>>> consumer. Then the coordinator doesn't have to wait for the 
>>>>>>>>>>>>>> timeout to
>>>>>>>>>>>>>> reassign partitions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Jason
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 27, 2015 at 1:25 PM, Jay Kreps <j...@confluent.io>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hey Kartik,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Totally agree we don't want people tuning timeouts in the
>>>>>>>>>>>>>>> common case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However there are two ways to avoid this:
>>>>>>>>>>>>>>> 1. Default the timeout high
>>>>>>>>>>>>>>> 2. Put the heartbeat in a separate thread
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When we were doing the consumer design we discussed this
>>>>>>>>>>>>>>> tradeoff and I think the conclusion we came to was that 
>>>>>>>>>>>>>>> defaulting to a
>>>>>>>>>>>>>>> high timeout was actually better. This means it takes a little 
>>>>>>>>>>>>>>> longer to
>>>>>>>>>>>>>>> detect a failure, but usually that is not a big problem and 
>>>>>>>>>>>>>>> people who want
>>>>>>>>>>>>>>> faster failure detection can tune it down. This seemed better 
>>>>>>>>>>>>>>> than having
>>>>>>>>>>>>>>> the failure detection not really cover the consumption and just 
>>>>>>>>>>>>>>> be a
>>>>>>>>>>>>>>> background ping. The two reasons where (a) you still have the 
>>>>>>>>>>>>>>> GC problem
>>>>>>>>>>>>>>> even for the background thread, (b) consumption is in some 
>>>>>>>>>>>>>>> sense a better
>>>>>>>>>>>>>>> definition of an active healthy consumer and a lot of problems 
>>>>>>>>>>>>>>> crop up when
>>>>>>>>>>>>>>> you have an inactive consumer with an active background thread 
>>>>>>>>>>>>>>> (as today).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When we had the discussion I think what we realized was that
>>>>>>>>>>>>>>> most people who were worried about the timeout where imagining 
>>>>>>>>>>>>>>> a very low
>>>>>>>>>>>>>>> default (500ms) say. But in fact just setting this to 60 
>>>>>>>>>>>>>>> seconds or higher
>>>>>>>>>>>>>>> as a default would be okay, this adds to the failure detection 
>>>>>>>>>>>>>>> time but
>>>>>>>>>>>>>>> only apps that care about this need to tune. This should 
>>>>>>>>>>>>>>> largely eliminate
>>>>>>>>>>>>>>> false positives since after all if you disappear for 60 seconds 
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> actually starts to be more of a true positive, even if you come 
>>>>>>>>>>>>>>> back... :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Jay
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jul 27, 2015 at 1:05 PM, Kartik Paramasivam <
>>>>>>>>>>>>>>> kparamasi...@linkedin.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> adding the open source alias.  This email started off as a
>>>>>>>>>>>>>>>> broader discussion around the new consumer.  I was zooming 
>>>>>>>>>>>>>>>> into only the
>>>>>>>>>>>>>>>> aspect of poll() being the only mechanism for driving the 
>>>>>>>>>>>>>>>> heartbeats.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes the lag is the effect of the problem (not the
>>>>>>>>>>>>>>>> problem).  Monitoring the lag is important as it is the 
>>>>>>>>>>>>>>>> primary way to tell
>>>>>>>>>>>>>>>> if the application is wedged.  There might be other metrics 
>>>>>>>>>>>>>>>> which can
>>>>>>>>>>>>>>>> possibly capture the same essence. Yes the lag is at the 
>>>>>>>>>>>>>>>> consumer group
>>>>>>>>>>>>>>>> level, but you can tell that one of the consumers is messed up 
>>>>>>>>>>>>>>>> if one of
>>>>>>>>>>>>>>>> the partitions in the application start generating lag and 
>>>>>>>>>>>>>>>> others are good
>>>>>>>>>>>>>>>> for e.g.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Monitoring aside, I think the main point of concern is that
>>>>>>>>>>>>>>>> in the old consumer most customers don't have to worry about 
>>>>>>>>>>>>>>>> unnecessary
>>>>>>>>>>>>>>>> rebalances and most of the things that they do in their app 
>>>>>>>>>>>>>>>> doesn't have an
>>>>>>>>>>>>>>>> impact on the session timeout..  (i.e. the only thing that 
>>>>>>>>>>>>>>>> causes
>>>>>>>>>>>>>>>> rebalances is when the GC is out of whack).    For the handful 
>>>>>>>>>>>>>>>> of customers
>>>>>>>>>>>>>>>> who are impacted by GC related rebalances, i would imagine 
>>>>>>>>>>>>>>>> that all of them
>>>>>>>>>>>>>>>> would really want us to make the system more resilient.    I 
>>>>>>>>>>>>>>>> agree that the
>>>>>>>>>>>>>>>> GC problem can't be solved easily in the java client, however 
>>>>>>>>>>>>>>>> it appears
>>>>>>>>>>>>>>>> that now we would be expecting the consuming applications to 
>>>>>>>>>>>>>>>> be even more
>>>>>>>>>>>>>>>> careful with ongoing tuning of the timeouts.  At LinkedIn, we 
>>>>>>>>>>>>>>>> have seen
>>>>>>>>>>>>>>>> that most kafka applications don't have much of a clue about 
>>>>>>>>>>>>>>>> configuring
>>>>>>>>>>>>>>>> the timeouts and just end up calling the Kafka team when their 
>>>>>>>>>>>>>>>> application
>>>>>>>>>>>>>>>> sees rebalances.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The other side effect of poll driving the heartbeats is
>>>>>>>>>>>>>>>> that we have to make sure that people don't set a poll timeout 
>>>>>>>>>>>>>>>> that is
>>>>>>>>>>>>>>>> larger than the session timeout.   If we had a notion of 
>>>>>>>>>>>>>>>> implicit
>>>>>>>>>>>>>>>> heartbeats then we could also automatically make this work for 
>>>>>>>>>>>>>>>> consumers by
>>>>>>>>>>>>>>>> sending hearbeats at the appropriate interval even though the 
>>>>>>>>>>>>>>>> customers
>>>>>>>>>>>>>>>> want to do a long poll.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We could surely work around this in LinkedIn if either we
>>>>>>>>>>>>>>>> have the Pause() api or an explicit HeartBeat() api on the 
>>>>>>>>>>>>>>>> consumer.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Would love to hear how other people think about this
>>>>>>>>>>>>>>>> subject ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Kartik
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Jul 25, 2015 at 7:41 PM, Neha Narkhede <
>>>>>>>>>>>>>>>> n...@confluent.io> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Agree with the dilemma you are pointing out, which is that
>>>>>>>>>>>>>>>>> there are many ways the application's message processing 
>>>>>>>>>>>>>>>>> could fail and we
>>>>>>>>>>>>>>>>> wouldn't be able to model all of those in the consumer's 
>>>>>>>>>>>>>>>>> failure detection
>>>>>>>>>>>>>>>>> mechanism. So we should try to model as much of it as we can 
>>>>>>>>>>>>>>>>> so the
>>>>>>>>>>>>>>>>> consumer's failure detection is meaningful.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Point being that the only absolute way to really detect
>>>>>>>>>>>>>>>>>> that an app is healthy is to monitor lag. If the lag 
>>>>>>>>>>>>>>>>>> increases then for
>>>>>>>>>>>>>>>>>> sure something is wrong.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The lag is merely the effect of the problem, not the
>>>>>>>>>>>>>>>>> problem itself. Lag is also a consumer group level concept 
>>>>>>>>>>>>>>>>> and the problem
>>>>>>>>>>>>>>>>> we have is being able to detect failures at the level of 
>>>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>> consumer instances.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As you pointed out, a consumer that poll() is a stronger
>>>>>>>>>>>>>>>>> indicator of whether the consumer is alive or not. The 
>>>>>>>>>>>>>>>>> dilemma then is who
>>>>>>>>>>>>>>>>> defines what a healthy poll() frequency is. No one else but 
>>>>>>>>>>>>>>>>> the application
>>>>>>>>>>>>>>>>> owner can define what a "normal" processing latency is for 
>>>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>>>> application. Now the question is what's the easiest way for 
>>>>>>>>>>>>>>>>> the user to
>>>>>>>>>>>>>>>>> define this without having to tune and fine tune this too 
>>>>>>>>>>>>>>>>> often. The
>>>>>>>>>>>>>>>>> heartbeat interval certainly does not have to be *exactly*
>>>>>>>>>>>>>>>>> 99tile of processing latency but could be in the ballpark + 
>>>>>>>>>>>>>>>>> an error delta.
>>>>>>>>>>>>>>>>> The error delta is the application owner's acceptable risk 
>>>>>>>>>>>>>>>>> threshold during
>>>>>>>>>>>>>>>>> which they would be ok if the application remains part of the 
>>>>>>>>>>>>>>>>> group despite
>>>>>>>>>>>>>>>>> being dead. It is ultimately a tradeoff between operational 
>>>>>>>>>>>>>>>>> ease and more
>>>>>>>>>>>>>>>>> accurate failure detection.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> With quotas the write latencies to kafka could range from
>>>>>>>>>>>>>>>>>> a few milliseconds all the way to a tens of seconds.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is actually no different from the GC problem. Most
>>>>>>>>>>>>>>>>> most of the times, the normal GC falls in the few ms range 
>>>>>>>>>>>>>>>>> and there are
>>>>>>>>>>>>>>>>> many applications even at LinkedIn for which the max GC falls 
>>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>>> multiple seconds range. Note that it also can't be predicted, 
>>>>>>>>>>>>>>>>> so has to be
>>>>>>>>>>>>>>>>> an observed value. One way or the other, you have to observe 
>>>>>>>>>>>>>>>>> what this
>>>>>>>>>>>>>>>>> acceptable "max" is for your application and then set the 
>>>>>>>>>>>>>>>>> appropriate
>>>>>>>>>>>>>>>>> timeouts.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Since this is not something that can be automated, this is
>>>>>>>>>>>>>>>>> a config that the application owner has to set based on the 
>>>>>>>>>>>>>>>>> expected
>>>>>>>>>>>>>>>>> behavior of their application. Not wanting to do that leads 
>>>>>>>>>>>>>>>>> to ending up
>>>>>>>>>>>>>>>>> with bad consumption semantics where the application process 
>>>>>>>>>>>>>>>>> continues to
>>>>>>>>>>>>>>>>> be part of a group owning partitions but not consuming since 
>>>>>>>>>>>>>>>>> it has halted
>>>>>>>>>>>>>>>>> due to a problem. The fact that the design requires them to 
>>>>>>>>>>>>>>>>> express that in
>>>>>>>>>>>>>>>>> poll() frequency or not doesn't change the fact that the 
>>>>>>>>>>>>>>>>> application owner
>>>>>>>>>>>>>>>>> has to go through the process of measuring and then defining 
>>>>>>>>>>>>>>>>> this "max".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The reverse where they don't do this and the application
>>>>>>>>>>>>>>>>> remains in the group despite being dead is super confusing 
>>>>>>>>>>>>>>>>> and frustrating
>>>>>>>>>>>>>>>>> too. So the due diligence up front is actually worth. And as 
>>>>>>>>>>>>>>>>> long as the
>>>>>>>>>>>>>>>>> poll() latency and processing latency can be monitored, it 
>>>>>>>>>>>>>>>>> should be easy
>>>>>>>>>>>>>>>>> to tell the reason for a rebalance, whether that is valid or 
>>>>>>>>>>>>>>>>> not and how
>>>>>>>>>>>>>>>>> that should be tuned.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As for the wrapper, KIP-28 is the wrapper in open source
>>>>>>>>>>>>>>>>> that will hide this complexity and I agree that LI is 
>>>>>>>>>>>>>>>>> unblocked since you
>>>>>>>>>>>>>>>>> can do this in TrackerConsumer in the meantime.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Neha
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jul 25, 2015 at 4:30 PM, Kartik Paramasivam <
>>>>>>>>>>>>>>>>> kparamasi...@linkedin.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For commit(), I think it should hopefully be an easier
>>>>>>>>>>>>>>>>>> discussion, so maybe we can follow up when we meet up next.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As far as the heartbeat is concerned, I think the points
>>>>>>>>>>>>>>>>>> you discuss are all very valid.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> GC pauses impacting the heartbeats is a real issue.
>>>>>>>>>>>>>>>>>> However there are a smaller percentage of memory hungry apps 
>>>>>>>>>>>>>>>>>> that get hit
>>>>>>>>>>>>>>>>>> by it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The broader issue whereby even if the heartbeats are
>>>>>>>>>>>>>>>>>> healthy, the app might not be behaving correctly is also 
>>>>>>>>>>>>>>>>>> real.  If the app
>>>>>>>>>>>>>>>>>> is calling poll() then the probability that the app is 
>>>>>>>>>>>>>>>>>> healthy is surely
>>>>>>>>>>>>>>>>>> higher.  But this again isn't an absolute measure that the 
>>>>>>>>>>>>>>>>>> app is
>>>>>>>>>>>>>>>>>> processing correctly.
>>>>>>>>>>>>>>>>>> In other cases the app might have even died in which case
>>>>>>>>>>>>>>>>>> this discussion is moot.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Point being that the only absolute way to really detect
>>>>>>>>>>>>>>>>>> that an app is healthy is to monitor lag. If the lag 
>>>>>>>>>>>>>>>>>> increases then for
>>>>>>>>>>>>>>>>>> sure something is wrong.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The proposal seems to be that the application needs to
>>>>>>>>>>>>>>>>>> tune their session timeout based on the 99tile of the time 
>>>>>>>>>>>>>>>>>> they take to
>>>>>>>>>>>>>>>>>> process events after every poll.   This turns out is a 
>>>>>>>>>>>>>>>>>> nontrivial thing to
>>>>>>>>>>>>>>>>>> do for an application todo. To start with when an 
>>>>>>>>>>>>>>>>>> application is new their
>>>>>>>>>>>>>>>>>> data is going to be based on tests that they have done on 
>>>>>>>>>>>>>>>>>> synthetic data.
>>>>>>>>>>>>>>>>>> This often times doesn't represent what they will see in 
>>>>>>>>>>>>>>>>>> production.  Once
>>>>>>>>>>>>>>>>>> the app is in production their processing latencies will 
>>>>>>>>>>>>>>>>>> potentially vary
>>>>>>>>>>>>>>>>>> over time.  It is extremely unlikely that the application 
>>>>>>>>>>>>>>>>>> owner does a
>>>>>>>>>>>>>>>>>> careful job of monitoring the 99tile of latencies over time 
>>>>>>>>>>>>>>>>>> and readjust
>>>>>>>>>>>>>>>>>> the settings.  Often times the latencies vary because of 
>>>>>>>>>>>>>>>>>> variance is other
>>>>>>>>>>>>>>>>>> services that are called by the consumer as part of 
>>>>>>>>>>>>>>>>>> processing the events.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Case in point would be a simple app which reads events
>>>>>>>>>>>>>>>>>> and writes to Kafka.  With quotas the write latencies to 
>>>>>>>>>>>>>>>>>> kafka could range
>>>>>>>>>>>>>>>>>> from a few milliseconds all the way to a tens of seconds.  
>>>>>>>>>>>>>>>>>> As the scale of
>>>>>>>>>>>>>>>>>> processing for an app increasing the app or that 'user' 
>>>>>>>>>>>>>>>>>> could now get
>>>>>>>>>>>>>>>>>> quotaed.  Instead of slowing down gracefully unless the 
>>>>>>>>>>>>>>>>>> application owner
>>>>>>>>>>>>>>>>>> has carefully tuned the timeout, now we are looking at a 
>>>>>>>>>>>>>>>>>> potential outage
>>>>>>>>>>>>>>>>>> where the app could get hit by constant rebalances.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If we expose the pause() Api then It is possible for us
>>>>>>>>>>>>>>>>>> to take care of this in the linkedin wrapper.  Whereby we 
>>>>>>>>>>>>>>>>>> would keep
>>>>>>>>>>>>>>>>>> calling poll on a separate thread periodically and enqueue 
>>>>>>>>>>>>>>>>>> the messages.
>>>>>>>>>>>>>>>>>> When the queue is full we would call pause().
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In essence we can work around it in LinkedIn, however I
>>>>>>>>>>>>>>>>>> think it is vastly better if we address this in the Api as 
>>>>>>>>>>>>>>>>>> every major
>>>>>>>>>>>>>>>>>> customer will eventually be pained by it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kartik
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Jul 24, 2015, at 10:08 PM, Jay Kreps <j...@confluent.io>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hey guys,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Happy to discuss. I agree there may be some rough edges
>>>>>>>>>>>>>>>>>> and now is definitely the time to clean them up.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm pretty reluctant to change the threading model or
>>>>>>>>>>>>>>>>>> undergo a big api redesign at this point beyond the group 
>>>>>>>>>>>>>>>>>> management stuff
>>>>>>>>>>>>>>>>>> we've discussed in the context of Samza/copycat which is 
>>>>>>>>>>>>>>>>>> already a big
>>>>>>>>>>>>>>>>>> effort.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Overall I agree that we have done a poor job of
>>>>>>>>>>>>>>>>>> documenting which apis block and which don't and when people 
>>>>>>>>>>>>>>>>>> are surprised
>>>>>>>>>>>>>>>>>> because we haven't labeled something that will be 
>>>>>>>>>>>>>>>>>> unintuitive. But the
>>>>>>>>>>>>>>>>>> overall style of poll/select-based apis is quite common in 
>>>>>>>>>>>>>>>>>> programming
>>>>>>>>>>>>>>>>>> going back to unix select so I don't think it's beyond 
>>>>>>>>>>>>>>>>>> people if explained
>>>>>>>>>>>>>>>>>> well (after all we need to mix sync and async apis and if we 
>>>>>>>>>>>>>>>>>> don't say
>>>>>>>>>>>>>>>>>> which is which any scheme will be confusing).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For what it's worth the experience with this api has
>>>>>>>>>>>>>>>>>> actually been about 1000x better than the issues people had 
>>>>>>>>>>>>>>>>>> around
>>>>>>>>>>>>>>>>>> intuitiveness with the high-level api. The crazy blocking 
>>>>>>>>>>>>>>>>>> iterator,
>>>>>>>>>>>>>>>>>> impossible internal queue sizing, baroque threading model, 
>>>>>>>>>>>>>>>>>> etc  have all
>>>>>>>>>>>>>>>>>> caused endless amounts of anger. Not to mention that that 
>>>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>>> effectively disqualifies about 50% of the use cases people 
>>>>>>>>>>>>>>>>>> want to try to
>>>>>>>>>>>>>>>>>> use it for (plus I regularly hear people tell me they've 
>>>>>>>>>>>>>>>>>> heard not to use
>>>>>>>>>>>>>>>>>> it at all for various reasons ranging from data loss to lack 
>>>>>>>>>>>>>>>>>> of features).
>>>>>>>>>>>>>>>>>> It's important to have that context when people need to 
>>>>>>>>>>>>>>>>>> switch and they say
>>>>>>>>>>>>>>>>>> "oh the old way was so simple and the new way complex!" :-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Let me give some context related to your points, based on
>>>>>>>>>>>>>>>>>> our previous discussions:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For commit, let's discuss, that is easy either way.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The motivation for avoiding additional threading was
>>>>>>>>>>>>>>>>>> two-fold. First this client is really intended to be the 
>>>>>>>>>>>>>>>>>> lowest level
>>>>>>>>>>>>>>>>>> client. There are many, many possible higher level 
>>>>>>>>>>>>>>>>>> processing abstractions.
>>>>>>>>>>>>>>>>>> One thing we found to be a big problem with the high-level 
>>>>>>>>>>>>>>>>>> client was that
>>>>>>>>>>>>>>>>>> it coupled things everyone must have--failover, etc--with 
>>>>>>>>>>>>>>>>>> things that are
>>>>>>>>>>>>>>>>>> different in each use case like the appropriate threading 
>>>>>>>>>>>>>>>>>> model. If you do
>>>>>>>>>>>>>>>>>> this you need to also maintain a thread free low-level 
>>>>>>>>>>>>>>>>>> consumer api for
>>>>>>>>>>>>>>>>>> people to get around whatever you have done.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The second reason was that the internal threading in the
>>>>>>>>>>>>>>>>>> client became quite complex. The answer with threading is 
>>>>>>>>>>>>>>>>>> always that "it
>>>>>>>>>>>>>>>>>> won't be complex this time", but it always is.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For the heartbeat you correctly describe the downside to
>>>>>>>>>>>>>>>>>> coupling heartbeat with poll--the contract is that the 
>>>>>>>>>>>>>>>>>> application must
>>>>>>>>>>>>>>>>>> regularly consume to be considered an active consumer. This 
>>>>>>>>>>>>>>>>>> allows the
>>>>>>>>>>>>>>>>>> possibility of false positive failure detections. However 
>>>>>>>>>>>>>>>>>> it's important to
>>>>>>>>>>>>>>>>>> understand the downside of the alternative. If you do 
>>>>>>>>>>>>>>>>>> background polling a
>>>>>>>>>>>>>>>>>> consumer is considered active as long as it isn't shutdown. 
>>>>>>>>>>>>>>>>>> This leads to
>>>>>>>>>>>>>>>>>> all kinds of active consumers that aren't consuming because 
>>>>>>>>>>>>>>>>>> they have
>>>>>>>>>>>>>>>>>> leaked or otherwise stopped but are still claiming 
>>>>>>>>>>>>>>>>>> partitions and
>>>>>>>>>>>>>>>>>> heart-beating. This failure mode is actually far far worse. 
>>>>>>>>>>>>>>>>>> If you allow
>>>>>>>>>>>>>>>>>> false positives the user sees the frequent rebalances and 
>>>>>>>>>>>>>>>>>> knows they aren't
>>>>>>>>>>>>>>>>>> consuming frequently enough to be considered active but if 
>>>>>>>>>>>>>>>>>> you allows false
>>>>>>>>>>>>>>>>>> negatives you end up having weeks go by before someone 
>>>>>>>>>>>>>>>>>> notices that a
>>>>>>>>>>>>>>>>>> partition has been unconsumed the whole time at which point 
>>>>>>>>>>>>>>>>>> the data is
>>>>>>>>>>>>>>>>>> gone. Plus of course even if you do this you still have 
>>>>>>>>>>>>>>>>>> regular false
>>>>>>>>>>>>>>>>>> positives anyway from GC pauses (as now). We discussed this 
>>>>>>>>>>>>>>>>>> in some depth
>>>>>>>>>>>>>>>>>> at the time and decided that it is better to have the 
>>>>>>>>>>>>>>>>>> liveness notion tied
>>>>>>>>>>>>>>>>>> to *actual* consumption which is the actual definition
>>>>>>>>>>>>>>>>>> of liveness.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Jay
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jul 24, 2015 at 5:35 PM, Onur Karaman <
>>>>>>>>>>>>>>>>>> okara...@linkedin.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Confluent Team.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> There has recently been a lot of open source activity
>>>>>>>>>>>>>>>>>>> regarding the new KafkaConsumer:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-2123
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-2350
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-2359
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/kafka-users/201507.mbox/%3ccaauywg_pwbs3hsevnp5rccmpvqbaamap+zgn8fh+woelvt_...@mail.gmail.com%3E
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We’ve explained the KafkaConsumer API to the Databus,
>>>>>>>>>>>>>>>>>>> Samza, and some other teams and we got similar feedback.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To summarize the feedback we received from other teams:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    1.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    The current behavior is not intuitive. For example,
>>>>>>>>>>>>>>>>>>>    KafkaConsumer.poll drives everything. The other methods 
>>>>>>>>>>>>>>>>>>> like subscribe,
>>>>>>>>>>>>>>>>>>>    unsubscribe, seek, commit(async) don’t do anything 
>>>>>>>>>>>>>>>>>>> without a
>>>>>>>>>>>>>>>>>>>    KafkaConsumer.poll call.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    1.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    The semantics of a commit() call should be
>>>>>>>>>>>>>>>>>>>    consistent between sync and async operations. Currently, 
>>>>>>>>>>>>>>>>>>> sync commit is a
>>>>>>>>>>>>>>>>>>>    blocking call which actually sends out an 
>>>>>>>>>>>>>>>>>>> OffsetCommitRequest and waits for
>>>>>>>>>>>>>>>>>>>    the response upon the user’s KafkaConsumer.commit call. 
>>>>>>>>>>>>>>>>>>> However, the async
>>>>>>>>>>>>>>>>>>>    commit is a nonblocking call which just queues up the 
>>>>>>>>>>>>>>>>>>> OffsetCommitRequest.
>>>>>>>>>>>>>>>>>>>    The request itself is later sent out in the next poll. 
>>>>>>>>>>>>>>>>>>> The teams we talked
>>>>>>>>>>>>>>>>>>>    to found this misleading.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    1.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    Heartbeats are dependent on user application
>>>>>>>>>>>>>>>>>>>    behavior (i.e. user applications calling poll). This can 
>>>>>>>>>>>>>>>>>>> be a big problem
>>>>>>>>>>>>>>>>>>>    as we don’t control how different applications behave. 
>>>>>>>>>>>>>>>>>>> For example, we
>>>>>>>>>>>>>>>>>>>    might have an application which reads from Kafka and 
>>>>>>>>>>>>>>>>>>> writes to Espresso. If
>>>>>>>>>>>>>>>>>>>    Espresso is slow for whatever reason, then in rebalances 
>>>>>>>>>>>>>>>>>>> could happen.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Generally speaking, we feel that the current
>>>>>>>>>>>>>>>>>>> KafkaConsumer API design is more of a wrapping around the 
>>>>>>>>>>>>>>>>>>> old simple
>>>>>>>>>>>>>>>>>>> consumer, i.e. in old consumer we ask users to deal with 
>>>>>>>>>>>>>>>>>>> raw protocols and
>>>>>>>>>>>>>>>>>>> error handlings while in KafkaConsumer we do that for 
>>>>>>>>>>>>>>>>>>> users. However, for
>>>>>>>>>>>>>>>>>>> old high level consumer users (which are the majority of 
>>>>>>>>>>>>>>>>>>> users), the
>>>>>>>>>>>>>>>>>>> experience is a noticeable regression. The old high level 
>>>>>>>>>>>>>>>>>>> consumer
>>>>>>>>>>>>>>>>>>> interface is simple and easy to use for end user, while 
>>>>>>>>>>>>>>>>>>> KafkaConsumer
>>>>>>>>>>>>>>>>>>> requires users to be aware of many underlying details and 
>>>>>>>>>>>>>>>>>>> is becoming
>>>>>>>>>>>>>>>>>>> prohibitive for users to adopt. This is hinted by the 
>>>>>>>>>>>>>>>>>>> javadoc growing
>>>>>>>>>>>>>>>>>>> bigger and bigger.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We think it's getting to the point where we should take
>>>>>>>>>>>>>>>>>>> a step back and look at the big picture.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The current state of KafkaConsumer is that it's
>>>>>>>>>>>>>>>>>>> single-threaded. There's one big KafkaConsumer.poll called 
>>>>>>>>>>>>>>>>>>> by the user
>>>>>>>>>>>>>>>>>>> which pretty much drives everything:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - data fetches
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - heartbeats
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - join groups (new consumer joining a group, topic
>>>>>>>>>>>>>>>>>>> subscription changes, reacting to group rebalance)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - async offset commits
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - executing callbacks
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Given that the selector's poll is being driven by the
>>>>>>>>>>>>>>>>>>> end user, this ends up making us educate users on NIO and 
>>>>>>>>>>>>>>>>>>> the consequences
>>>>>>>>>>>>>>>>>>> of not calling KafkaConsumer.poll frequently enough:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - Coordinator will mark the consumer dead
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - async commits won't send
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - callbacks won't fire
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> More generally speaking, there are many surprises with
>>>>>>>>>>>>>>>>>>> the current KafkaConsumer implementation.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Here's what we consider to be the goals of KafkaConsumer:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - NIO
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - ability to commit, manipulate offsets, and consume
>>>>>>>>>>>>>>>>>>> messages
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - a way to subscribe to topics(auto group management) or
>>>>>>>>>>>>>>>>>>> partitions(explicit group management)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - no surprises in the user experience
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The last point is the big one that we think we aren't
>>>>>>>>>>>>>>>>>>> hitting. We think the most important example is that there 
>>>>>>>>>>>>>>>>>>> should be no
>>>>>>>>>>>>>>>>>>> requirement from the end user to consistently 
>>>>>>>>>>>>>>>>>>> KafkaConsumer.poll in order
>>>>>>>>>>>>>>>>>>> for all of the above tasks to happen. We think it would be 
>>>>>>>>>>>>>>>>>>> better to split
>>>>>>>>>>>>>>>>>>> those tasks into tasks that should not rely on 
>>>>>>>>>>>>>>>>>>> KafkaConsumer.poll and tasks
>>>>>>>>>>>>>>>>>>> that should rely on KafkaConsumer.poll.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Tasks that should not rely on KafkaConsumer.poll:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - heartbeats
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - join groups
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - commits
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - executing callbacks
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Only data fetches should rely on KafkaConsumer.poll
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This would help reduce the amount of surprises to the
>>>>>>>>>>>>>>>>>>> end user.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We’ve sketched out a proposal and we’ll send it out to
>>>>>>>>>>>>>>>>>>> you guys early next week. We’d like to meet up with you at 
>>>>>>>>>>>>>>>>>>> LinkedIn on *July
>>>>>>>>>>>>>>>>>>> 31, 2015* so we can talk about it before proposing it
>>>>>>>>>>>>>>>>>>> to open source.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> LinkedIn Kafka Dev Team
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Neha
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> Ewen
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>> Neha
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Neha
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Kafka Consumer thoughts

Reply via email to