I updated the KIP with the ideas we've been discussing.

best,
Colin

On Tue, Nov 28, 2017, at 08:38, Colin McCabe wrote:
> On Mon, Nov 27, 2017, at 22:30, Jan Filipiak wrote:
> > Hi Colin, thank you for this KIP, it can become a really useful thing.
> >
> > I just scanned through the discussion so far and wanted to start a
> > thread to make a decision about keeping the
> > cache with the Connection / Session or having some sort of UUID indexed
> > global Map.
> >
> > Sorry if that has been settled already and I missed it. In this case
> > could anyone point me to the discussion?
>
> Hi Jan,
>
> I don't think anyone has discussed the idea of tying the cache to an
> individual TCP session yet. I agree that since the cache is intended to
> be used only by a single follower or client, it's an interesting thing
> to think about.
>
> I guess the obvious disadvantage is that whenever your TCP session
> drops, you have to make a full fetch request rather than an incremental
> one. It's not clear to me how often this happens in practice -- it
> probably depends a lot on the quality of the network. From a code
> perspective, it might also be a bit difficult to access data associated
> with the Session from classes like KafkaApis (although we could refactor
> it to make this easier).
>
> It's also clear that even if we tie the cache to the session, we still
> have to have limits on the number of caches we're willing to create.
> And probably we should reserve some cache slots for each follower, so
> that clients don't take all of them.
>
> >
> > I'd rather see a protocol in which the client is hinting the broker that
> > he is going to use the feature instead of a client
> > realizing that the broker just offered the feature (regardless of
> > protocol version which should only indicate that the feature
> > would be usable).
>
> Hmm. I'm not sure what you mean by "hinting." I do think that the
> server should have the option of not accepting incremental requests from
> specific clients, in order to save memory space.
>
> > This seems to work better with a per
> > connection/session attached Metadata than with a Map and could allow for
> > easier client implementations.
> > It would also make Client-side code easier as there wouldn't be any
> > Cache-miss error Messages to handle.
>
> It is nice not to have to handle cache-miss responses, I agree.
> However, TCP sessions aren't exposed to most of our client-side code.
> For example, when the Producer creates a message and hands it off to the
> NetworkClient, the NC will transparently re-connect and re-send a
> message if the first send failed. The higher-level code will not be
> informed about whether the TCP session was re-established, whether an
> existing TCP session was used, and so on. So overall I would still lean
> towards not coupling this to the TCP session...
>
> best,
> Colin
>
> >
> > Thank you again for the KIP. And again, if this was clarified already
> > please drop me a hint where I could read about it.
> >
> > Best Jan
> >
> >
> > On 21.11.2017 22:02, Colin McCabe wrote:
> > > Hi all,
> > >
> > > I created a KIP to improve the scalability and latency of FetchRequest:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
> > >
> > > Please take a look.
> > >
> > > cheers,
> > > Colin
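
For readers following the thread, the "UUID indexed global Map" option, together with the two constraints Colin mentions (a limit on the number of caches, and slots reserved so that clients cannot take all of them from followers), could look roughly like the sketch below. This is not code from the KIP or from Kafka; the class name `FetchSessionCache`, its methods, and the parameters `max_slots` and `reserved_for_followers` are all hypothetical and exist only to illustrate the idea. A failed `lookup` corresponds to the cache-miss case in which the client would have to fall back to a full fetch request.

```python
import uuid


class FetchSessionCache:
    """Hypothetical broker-side map of incremental-fetch caches, keyed by UUID."""

    def __init__(self, max_slots, reserved_for_followers):
        if reserved_for_followers > max_slots:
            raise ValueError("cannot reserve more slots than exist")
        self.max_slots = max_slots
        self.reserved = reserved_for_followers
        # session_id -> (is_follower, per-partition state)
        self.sessions = {}

    def _client_count(self):
        # Number of slots currently held by ordinary (non-follower) clients.
        return sum(1 for is_follower, _ in self.sessions.values() if not is_follower)

    def try_create(self, is_follower):
        """Return a new session ID, or None if the broker declines to cache."""
        if len(self.sessions) >= self.max_slots:
            return None
        # Ordinary clients may only use the slots not reserved for followers.
        if not is_follower and self._client_count() >= self.max_slots - self.reserved:
            return None
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = (is_follower, {})
        return session_id

    def lookup(self, session_id):
        """Return the cached state, or None on a cache miss (full fetch needed)."""
        entry = self.sessions.get(session_id)
        return None if entry is None else entry[1]

    def remove(self, session_id):
        self.sessions.pop(session_id, None)
```

Because the key is an ID carried in the request rather than the TCP connection, a cache entry in this sketch survives a transparent reconnect by the NetworkClient, at the cost of the client having to handle the miss case.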