Hi Zahari,

Oops. We had planned to put this patch upstream, but it somehow slipped my mind. We were recently going over the hotfixes we carry, and this one has been due for some time now. Glad to know that someone other than us might also benefit from it :)

Thanks,

Mayuresh

On Thu, Oct 25, 2018 at 12:25 PM Zahari Dichev <zaharidic...@gmail.com> wrote:

> Hi there Mayuresh,
>
> Great to hear that this has actually been working well in production for
> some time now. I have changed the details of the KIP to reflect that, as
> already discussed, we do not really need any kind of configuration, since
> this data should not be thrown away at all. Submitting a PR sounds great,
> although I feel a bit jealous that you (LinkedIn) beat me to my first Kafka
> commit ;) Not sure how things stand with the voting process?
>
> Zahari
>
> On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
>
> > Hi Colin/Zahari,
> >
> > I have created a ticket for the similar/same feature:
> > https://issues.apache.org/jira/browse/KAFKA-7548
> >
> > We (LinkedIn) had a use case in Samza when they moved from the
> > SimpleConsumer to KafkaConsumer and wanted to use this pause and resume
> > pattern. They noticed a performance degradation when they started using
> > KafkaConsumer.assign() and pausing and unpausing partitions. We realized
> > that not throwing away the prefetched data for paused partitions might
> > improve the performance. We wrote a benchmark (I can share it if needed)
> > to prove this. I have attached the findings in the ticket.
> >
> > We have been running the hotfix internally for quite a while now. When
> > Samza ran this fix in production, they saw a 30% improvement in their
> > app performance.
> >
> > I have the patch ready on our internal branch and would like to submit a
> > PR for this on the above ticket asap.
> >
> > I am not sure if we need a separate config for this, as we haven't seen
> > a lot of memory overhead due to this in our systems. We have had this
> > running in production for a considerable amount of time without any
> > issues.
> >
> > It would be great if you guys can review the PR once it's up and see if
> > it satisfies your requirement. If it doesn't, then we can think more
> > about the config-driven approach.
> >
> > Thoughts??
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Zahari,
> > >
> > > One question we didn't figure out earlier was who would actually want
> > > this cached data to be thrown away. If there's nobody who actually
> > > wants this, then perhaps we can simplify the proposal by just
> > > unconditionally retaining the cache until the partition is resumed, or
> > > we unsubscribe from the partition. This would avoid adding a new
> > > configuration.
> > >
> > > best,
> > > Colin
> > >
> > > On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
> > > > Hi there, although it has been discussed briefly already in this
> > > > thread
> > > > <https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E>,
> > > > I decided to follow the process and initiate a DISCUSS thread.
> > > > Comments and suggestions are more than welcome.
> > > >
> > > > Zahari Dichev
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125

--
-Regards,
Mayuresh R. Gharat
(862) 250-7125
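
For readers following the thread, here is a toy sketch in Python of why retaining prefetched data for paused partitions can help. It is not Kafka's fetcher code and it does not reproduce the benchmark attached to KAFKA-7548. `ToyConsumer`, `retain_paused`, `fetch_size` and `run` are hypothetical names invented for this illustration, and the in-memory `log` dictionary stands in for the broker. The sketch only models the idea discussed above: if buffered records are dropped whenever a partition is paused, they have to be fetched again after the partition is resumed.

```python
from collections import deque


class ToyConsumer:
    """Toy model of a consumer with a prefetch buffer (NOT Kafka's implementation)."""

    def __init__(self, log, fetch_size, retain_paused):
        self.log = log                            # partition -> records (stand-in for the broker)
        self.fetch_size = fetch_size              # records pulled per simulated fetch
        self.retain_paused = retain_paused        # keep buffered data while paused?
        self.position = {p: 0 for p in log}       # next offset to hand to the application
        self.buffer = {p: deque() for p in log}   # prefetched but not yet delivered
        self.paused = set()
        self.fetched = 0                          # total records transferred from the "broker"

    def pause(self, partition):
        self.paused.add(partition)

    def resume(self, partition):
        self.paused.discard(partition)

    def poll(self, max_records=1):
        out = []
        for p, buf in self.buffer.items():
            if p in self.paused:
                if not self.retain_paused:
                    # Discarding behaviour: buffered records are thrown away
                    # and must be fetched again from `position` after resume.
                    buf.clear()
                continue
            if not buf:
                start = self.position[p]
                batch = self.log[p][start:start + self.fetch_size]
                buf.extend(batch)
                self.fetched += len(batch)
            for _ in range(min(max_records, len(buf))):
                out.append((p, buf.popleft()))
                self.position[p] += 1
        return out


def run(retain_paused):
    """Consume two 20-record partitions while pausing p0 on every odd step."""
    log = {"p0": list(range(20)), "p1": list(range(20))}
    consumer = ToyConsumer(log, fetch_size=5, retain_paused=retain_paused)
    delivered = {"p0": [], "p1": []}
    step = 0
    while sum(len(v) for v in delivered.values()) < 40:
        if step % 2 == 1:
            consumer.pause("p0")
        else:
            consumer.resume("p0")
        for partition, record in consumer.poll():
            delivered[partition].append(record)
        step += 1
    return delivered, consumer.fetched
```

In this toy setup both modes deliver the same 40 records in order, but `run(True)` transfers 40 records from the simulated broker while `run(False)` transfers 110, because every pause of `p0` throws away up to four buffered records. The real gain depends on fetch sizes, record sizes and how often the application pauses, so the numbers here say nothing about the 30% figure reported above.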