Hi everyone, I want to revive a solution to this issue. I created a new PR that accomodates Jason Gustafson's suggestion in the original PR to re-add paused completed fetches back to the completed fetches queue for less bookeeping. If someone could jump in and do a review it would be appreciated!
I also updated KAFKA-7548 with more information. Patch: https://github.com/apache/kafka/pull/6988 (original: https://github.com/apache/kafka/pull/5844) Jira: https://issues.apache.org/jira/browse/KAFKA-7548 Sample project: https://github.com/seglo/kafka-consumer-tests/tree/seglo/KAFKA-7548 Grafana snapshot: https://snapshot.raintank.io/dashboard/snapshot/RDFTsgNvzP5bTmuc8X6hq7vLixp9tUtL?orgId=2 Regards, Sean On Wed, Oct 31, 2018 at 8:41 PM Zahari Dichev <zaharidic...@gmail.com> wrote: > Just looked at it, > > Great work. Thanks a lot for the patch. This should certainly improve > things ! > > Zahari > > On Wed, Oct 31, 2018 at 6:25 PM <zaharidic...@gmail.com> wrote: > > > Hi there, I will take a look first thing i get home. > > > > Zahari > > > > > On 31 Oct 2018, at 18:23, Mayuresh Gharat <gharatmayures...@gmail.com> > > wrote: > > > > > > Hi Colin, Zahari, > > > > > > Wanted to check if you can review the patch and let me know, if we need > > to > > > make any changes? > > > > > > Thanks, > > > > > > Mayuresh > > > > > > On Fri, Oct 26, 2018 at 1:41 PM Zahari Dichev <zaharidic...@gmail.com> > > > wrote: > > > > > >> Thanks for participating the discussion. Indeed, I learned quite a > lot. > > >> Will take a look at the patch as well and spend some time hunting for > > some > > >> other interesting issue to work on :) > > >> > > >> Cheers, > > >> Zahari > > >> > > >>> On Fri, Oct 26, 2018 at 8:49 PM Colin McCabe <cmcc...@apache.org> > > wrote: > > >>> > > >>> Hi Zahari, > > >>> > > >>> I think we can retire the KIP, since the KAFKA-7548 patch should > solve > > >> the > > >>> issue without any changes that require a KIP. This is actually the > > best > > >>> thing we could do for our users, since things will "just work" more > > >>> efficiently without a lot of configuration knobs. > > >>> > > >>> I think you did an excellent job raising this issue and discussing > it. > > >>> It's a very good contribution to the project even if you don't end up > > >>> writing the patch yourself. I'm going to take a look at the patch > > today. > > >>> If you want to take a look, that would also be good. > > >>> > > >>> best, > > >>> Colin > > >>> > > >>> > > >>>> On Thu, Oct 25, 2018, at 12:25, Zahari Dichev wrote: > > >>>> Hi there Mayuresh, > > >>>> > > >>>> Great to heat that this is actually working well in production for > > some > > >>>> time now. I have changed the details of the KIP to reflect the fact > > >> that > > >>> as > > >>>> already discussed - we do not really need any kind of configuration > as > > >>> this > > >>>> data should not be thrown away at all. Submitting a PR sounds > great, > > >>>> although I feel a bit jealous you (LinkedIn) beat me to my first > kafka > > >>>> commit ;) Not sure how things stand with the voting process ? > > >>>> > > >>>> Zahari > > >>>> > > >>>> > > >>>> > > >>>> On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat < > > >>> gharatmayures...@gmail.com> > > >>>> wrote: > > >>>> > > >>>>> Hi Colin/Zahari, > > >>>>> > > >>>>> I have created a ticket for the similar/same feature : > > >>>>> https://issues.apache.org/jira/browse/KAFKA-7548 > > >>>>> We (Linkedin) had a use case in Samza at Linkedin when they moved > > >> from > > >>> the > > >>>>> SimpleConsumer to KafkaConsumer and they wanted to do this pause > and > > >>> resume > > >>>>> pattern. > > >>>>> They realized there was performance degradation when they started > > >> using > > >>>>> KafkaConsumer.assign() and pausing and unPausing partitions. We > > >>> realized > > >>>>> that not throwing away the prefetched data for paused partitions > > >> might > > >>>>> improve the performance. We wrote a benchmark (I can share it if > > >>> needed) to > > >>>>> prove this. I have attached the findings in the ticket. > > >>>>> We have been running the hotfix internally for quite a while now. > > >> When > > >>>>> samza ran this fix in production, they realized 30% improvement in > > >>> there > > >>>>> app performance. > > >>>>> I have the patch ready on our internal branch and would like to > > >> submit > > >>> a PR > > >>>>> for this on the above ticket asap. > > >>>>> I am not sure, if we need a separate config for this as we haven't > > >>> seen a > > >>>>> lot of memory overhead due to this in our systems. We have had this > > >>> running > > >>>>> in production for a considerable amount of time without any issues. > > >>>>> It would be great if you guys can review the PR once its up and see > > >> if > > >>> that > > >>>>> satisfies your requirement. If it doesn't then we can think more on > > >> the > > >>>>> config driven approach. > > >>>>> Thoughts?? > > >>>>> > > >>>>> Thanks, > > >>>>> > > >>>>> Mayuresh > > >>>>> > > >>>>> > > >>>>> On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe <cmcc...@apache.org> > > >>> wrote: > > >>>>> > > >>>>>> Hi Zahari, > > >>>>>> > > >>>>>> One question we didn't figure out earlier was who would actually > > >> want > > >>>>> this > > >>>>>> cached data to be thrown away. If there's nobody who actually > > >> wants > > >>>>> this, > > >>>>>> then perhaps we can simplify the proposal by just unconditionally > > >>>>> retaining > > >>>>>> the cache until the partition is resumed, or we unsubscribe from > > >> the > > >>>>>> partition. This would avoid adding a new configuration. > > >>>>>> > > >>>>>> best, > > >>>>>> Colin > > >>>>>> > > >>>>>> > > >>>>>>> On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote: > > >>>>>>> Hi there, although it has been discussed briefly already in this > > >>> thread > > >>>>>>> < > > >>>>>> > > >>>>> > > >>> > > >> > > > https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E > > >>>>>>> , > > >>>>>>> I decided to follow the process and initiate a DISCUSS thread. > > >>> Comments > > >>>>>>> and > > >>>>>>> suggestions are more than welcome. > > >>>>>>> > > >>>>>>> > > >>>>>>> Zahari Dichev > > >>>>>> > > >>>>> > > >>>>> > > >>>>> -- > > >>>>> -Regards, > > >>>>> Mayuresh R. Gharat > > >>>>> (862) 250-7125 > > >>>>> > > >>> > > >> > > > > > > > > > -- > > > -Regards, > > > Mayuresh R. Gharat > > > (862) 250-7125 > > > -- Principal Engineer, Lightbend, Inc. <http://lightbend.com> @seg1o <https://twitter.com/seg1o>, in/seanaglover <https://www.linkedin.com/in/seanaglover/>