Thanks for all the replies. I've updated the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer

The main change is to read selectively from sockets instead of throttling the sending of FetchRequests. I also mention that it will reuse the MemoryPool implementation created in KIP-72 instead of adding another memory-tracking mechanism.
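To make the idea concrete, here is a minimal sketch of the simplest kind of pool: one that only tracks how many bytes are still available. It is not the KIP-72 code; the class name BoundedPool and its methods are hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, NOT the KIP-72 MemoryPool implementation.
// It only tracks a byte budget; buffers are not actually reused.
class BoundedPool {
    private final AtomicLong available;

    BoundedPool(long capacityBytes) {
        this.available = new AtomicLong(capacityBytes);
    }

    // Returns a buffer of the requested size, or null if the remaining
    // budget cannot cover it. A null result is the caller's signal to
    // leave the response on the socket and try again later.
    ByteBuffer tryAllocate(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        while (true) {
            long current = available.get();
            if (current < size) {
                return null;
            }
            // Reserve the bytes atomically; retry if another thread raced us.
            if (available.compareAndSet(current, current - size)) {
                return ByteBuffer.allocate(size);
            }
        }
    }

    // Gives the buffer's bytes back to the budget once the data is consumed.
    void release(ByteBuffer buffer) {
        available.addAndGet(buffer.capacity());
    }

    long availableMemory() {
        return available.get();
    }
}
```

With something like this, the network layer would call tryAllocate before reading a response and skip that socket when it gets null, resuming once release has returned enough memory.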
Please have another look. As always, comments are welcome!

On Thu, Nov 10, 2016 at 2:47 AM, radai <radai.rosenbl...@gmail.com> wrote:
> selectively reading from sockets achieves memory control (up to and not
> including talk of (de)compression)
>
> this is exactly what i (also, even mostly) did for kip-72 - which i hope in
> itself should be a reason to think about both KIPs at the same time because
> the changes will be similar (at least in intent) and might result in
> duplicated effort.
>
> a pool API is a way to "scale" all the way from just maintaining a variable
> holding amount of available memory (which is what my current kip-72 code
> does and what this kip proposes IIUC) all the way up to actually re-using
> buffers without any changes to the code using the pool - just drop in a
> different pool impl.
>
> for "edge nodes" (producer/consumer) the performance gain in actually
> pooling large buffers may be arguable, but i suspect for brokers regularly
> operating on 1MB-sized requests (which is the norm at linkedin) the
> resulting memory fragmentation is an actual bottleneck (i have initial
> micro-benchmark results to back this up but have not had the time to do a
> full profiling run).
>
> so basically I'm saying we may be doing (very) similar things in mostly the
> same areas of code.
>
> On Wed, Nov 2, 2016 at 11:35 AM, Mickael Maison <mickael.mai...@gmail.com>
> wrote:
>
>> Selectively reading from the socket should make it possible to
>> control the memory usage without impacting performance. I've had a look
>> at that today and I can see how that would work.
>> I'll update the KIP accordingly tomorrow.