Re: [DISCUSS] KIP-227: Introduce Incremental FetchRequests to Increase Partition Scalability

Jay Kreps Wed, 22 Nov 2017 13:09:03 -0800

Okay yeah, what I said didn't really work or make sense. Ismael's
interpretation is better.


Couple of things to point out:

   1. I'm less sure that replication has a high partition count and
   consumers don't. There are definitely use cases for consumers that
   subscribe to everything (e.g. load all my data into HDFS) as well as super
   high partition count topics. In a bigger cluster it is unlikely a given
   node is actually replicating that many partitions from another particular
   node (though perhaps in aggregate the effect is the same). I think it would
   clearly be desirable to have a solution that targeted both the consumer and
   replication if that were achievable. I agree with the concern on memory,
   but perhaps there could be a way to be smart about the memory usage?
   2. For the question of one request vs two, one difference in values here
   may be that it sounds like you are proposing a less ideal protocol to
   simplify the broker code. To me the protocol is really *the* fundamental
   interface in Kafka and we should really strive to make that something that
   is beautiful and makes sense on its own (without needing to understand the
   history of how we got there). I think there may well be such an explanation
   for the two API version (as you kind of said with your HDFS analogy) but
   really making it clear how these two APIs are different and how they
   interact is key. Like, basically I think we should be able to explain it
   from scratch in such a way that it is obvious you'd have these two things
   as the fundamental primitives for fetching data.

-Jay

On Wed, Nov 22, 2017 at 11:02 AM, Colin McCabe <[email protected]> wrote:

> Hi Jay,
>
> On Tue, Nov 21, 2017, at 19:03, Jay Kreps wrote:
> > I think the general thrust of this makes a ton of sense.
> >
> > I don't love that we're introducing a second type of fetch request. I
> > think the motivation is for compatibility, right? But isn't that what
> > versioning s for? Basically to me although the modification we're making
> makes
> > sense, the resulting protocol doesn't really seem like something you
> would
> > design this way from scratch.
>
> I think there are two big reasons to consider separating
> IncrementalFetchRequest from FetchRequest.
>
> As you say, the first reason is compatibility.  We will have to support
> the full FetchRequest for a long time to come because of our
> compatibility policy.  It would be good from a code quality point of
> view to avoid having widely diverging code paths for different versions
> of this request.
>
> The other reason is that conceptually I feel that there should be both
> full and incremental fetch requests.  This is similar to how HDFS has
> both incremental and full block reports.  The full reports are necessary
> when a node is restarted.  In HDFS, they also serve a periodic sanity
> check if the DataNode's view of what blocks exist has become
> desynchronized from the NameNode's view.  While in theory you could
> avoid the sanity check, in practice it often was important.
>
> Also, just to be clear, I don't think we should convert KafkaConsumer to
> using incremental fetch requests.  It seems inadvisable to allocate
> broker memory for each KafkaConsumer.  After all, there can be quite a
> few consumers, and we don't know ahead of time how many there will be.
> This is very different than brokers, where there are a small,
> more-or-less constant, number.  Also, consumers tend not to consume from
> a massive number of topics all at once, so I don't think they have the
> same problems with the existing FetchRequest RPC as followers do.
>
> >
> > I think I may be misunderstanding the semantics of the partitions in
> > IncrementalFetchRequest. I think the intention is that the server
> > remembers the partitions you last requested, and the partitions you
> specify
> > in the request are added to this set. This is a bit odd though because
> you can
> > add partitions but I don't see how you remove them, so it doesn't really
> let
> > you fully make changes incrementally. I suspect I'm misunderstanding that
> > somehow, though.
>
> Sorry, I may have done a poor job explaining the proposal.  The
> intention is that you cannot change the set of partitions you are
> receiving information about except by making a full FetchRequest.  If
> you need to make any changes to the watch set whatsoever, you must make
> a full request, not an incremental.  The idea is that changes are very
> infrequent, so we don't need to optimize this at the moment.
>
> > You'd also need to be a little bit careful that there was
> > no way for the server's idea of what the client is interested in and the
> > client's idea to ever diverge as you made these modifications over time
> > (due to bugs or whatever).
> >
> > It seems like an alternative would be to not add a second request, but
> > instead change the fetch api and implementation
> >
> >    1. We save the partitions you last fetched on that connection in the
> >    session for the connection (as I think you are proposing)
> >    2. It only gives you back info on partitions that have data or have
> >    changed (no reason you need the others, right?)
> >    3. Not specifying any partitions means "give me the usual", as defined
> >    by whatever you requested before attached to the session.
> >
> > This would be a new version of the fetch API, so compatibility would be
> > retained by retaining the older version as is.
> >
> > This seems conceptually simpler to me. It's true that you have to resend
> > the full set whenever you want to change it, but that actually seems less
> > error prone and that should be rare.
> >
> > I suspect you guys thought about this and it doesn't quite work, but
> > maybe you could explain why?
>
> I think your proposal is actually closer to what I was intending than
> you thought.  Like I said above, I believe watch-set-change operations
> should require a full fetch request.  It is certainly simpler to
> implement and understand.
>
> If I understand your proposal correctly, you are suggesting that the
> existing FetchRequest RPC should be able to do double duty as either a
> full or an incremental request?
>
> best,
> Colin
>
> >
> > -Jay
> >
> > On Tue, Nov 21, 2017 at 1:02 PM, Colin McCabe <[email protected]>
> wrote:
> >
> > > Hi all,
> > >
> > > I created a KIP to improve the scalability and latency of FetchRequest:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 227%3A+Introduce+Incremental+FetchRequests+to+Increase+
> > > Partition+Scalability
> > >
> > > Please take a look.
> > >
> > > cheers,
> > > Colin
> > >
>

Re: [DISCUSS] KIP-227: Introduce Incremental FetchRequests to Increase Partition Scalability

Reply via email to