Hi Jay,

On Tue, Nov 21, 2017, at 19:03, Jay Kreps wrote:
> I think the general thrust of this makes a ton of sense.
> 
> I don't love that we're introducing a second type of fetch request. I
> think the motivation is for compatibility, right? But isn't that what
> versioning s for? Basically to me although the modification we're making makes
> sense, the resulting protocol doesn't really seem like something you would
> design this way from scratch.

I think there are two big reasons to consider separating
IncrementalFetchRequest from FetchRequest.

As you say, the first reason is compatibility.  We will have to support
the full FetchRequest for a long time to come because of our
compatibility policy.  It would be good from a code quality point of
view to avoid having widely diverging code paths for different versions
of this request.

The other reason is that conceptually I feel that there should be both
full and incremental fetch requests.  This is similar to how HDFS has
both incremental and full block reports.  The full reports are necessary
when a node is restarted.  In HDFS, they also serve a periodic sanity
check if the DataNode's view of what blocks exist has become
desynchronized from the NameNode's view.  While in theory you could
avoid the sanity check, in practice it often was important.

Also, just to be clear, I don't think we should convert KafkaConsumer to
using incremental fetch requests.  It seems inadvisable to allocate
broker memory for each KafkaConsumer.  After all, there can be quite a
few consumers, and we don't know ahead of time how many there will be. 
This is very different than brokers, where there are a small,
more-or-less constant, number.  Also, consumers tend not to consume from
a massive number of topics all at once, so I don't think they have the
same problems with the existing FetchRequest RPC as followers do.

> 
> I think I may be misunderstanding the semantics of the partitions in
> IncrementalFetchRequest. I think the intention is that the server
> remembers the partitions you last requested, and the partitions you specify
> in the request are added to this set. This is a bit odd though because you can
> add partitions but I don't see how you remove them, so it doesn't really let
> you fully make changes incrementally. I suspect I'm misunderstanding that
> somehow, though.

Sorry, I may have done a poor job explaining the proposal.  The
intention is that you cannot change the set of partitions you are
receiving information about except by making a full FetchRequest.  If
you need to make any changes to the watch set whatsoever, you must make
a full request, not an incremental.  The idea is that changes are very
infrequent, so we don't need to optimize this at the moment.

> You'd also need to be a little bit careful that there was
> no way for the server's idea of what the client is interested in and the
> client's idea to ever diverge as you made these modifications over time
> (due to bugs or whatever).
> 
> It seems like an alternative would be to not add a second request, but
> instead change the fetch api and implementation
> 
>    1. We save the partitions you last fetched on that connection in the
>    session for the connection (as I think you are proposing)
>    2. It only gives you back info on partitions that have data or have
>    changed (no reason you need the others, right?)
>    3. Not specifying any partitions means "give me the usual", as defined
>    by whatever you requested before attached to the session.
> 
> This would be a new version of the fetch API, so compatibility would be
> retained by retaining the older version as is.
> 
> This seems conceptually simpler to me. It's true that you have to resend
> the full set whenever you want to change it, but that actually seems less
> error prone and that should be rare.
> 
> I suspect you guys thought about this and it doesn't quite work, but
> maybe you could explain why?

I think your proposal is actually closer to what I was intending than
you thought.  Like I said above, I believe watch-set-change operations
should require a full fetch request.  It is certainly simpler to
implement and understand.

If I understand your proposal correctly, you are suggesting that the
existing FetchRequest RPC should be able to do double duty as either a
full or an incremental request?

best,
Colin

> 
> -Jay
> 
> On Tue, Nov 21, 2017 at 1:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
> 
> > Hi all,
> >
> > I created a KIP to improve the scalability and latency of FetchRequest:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 227%3A+Introduce+Incremental+FetchRequests+to+Increase+
> > Partition+Scalability
> >
> > Please take a look.
> >
> > cheers,
> > Colin
> >

Reply via email to