Thanks, Lari, for sharing.

Enrico

On Fri, 24 Nov 2023, 11:03 Lari Hotari <lhot...@apache.org> wrote:

> Pulsar Community Meeting minutes 2023/11/23
>
> Notice: Draft minutes pending review - please suggest any corrections or
> additions by replying to this email thread.
>
> - Attendees:
>   - Girish Sharma
>   - YuWei Sung
>   - Apurva T
>   - Asaf Mesika
>   - Lari Hotari
>   - Chris Bono
>
> - Agenda
>
>   - PIP-310 and rate limiting improvements
>
>     - Pulsar Rate Limiter requirements by Girish -
>       https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc
>
>     - Lari to present summary of views on PIP-310. This is documented
>       in the blog post “Apache Pulsar service level objectives and
>       rate limiting”
>       <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html>.
>       Please read the blog post before the meeting as preparation.
>
> - Meeting Minutes:
>
>   - Girish presented the background and the problem with the current
>     rate limiters by going over the Pulsar Rate Limiter document
>     <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc>.
>     The conclusion is that there’s a need for supporting bursting
>     while keeping the allowed bursting on a single broker under the
>     limit of what the broker can do.
>
>     - Related to how the combined bursting of all topics in a broker
>       could be kept under the limits of a broker, Lari added that in
>       Confluent Kora, there's a concept called dynamic quota
>       management that is described in the Kora paper section 5.2.2
>       <http://vldb.org/pvldb/vol16/p3822-povzner.pdf#page=11>: "Kora
>       addresses this issue by using a dynamic quota mechanism that
>       adjusts bandwidth distribution based on a tenant’s bandwidth
>       consumption."
>
>       - While bursting, the remaining available capacity on the broker
>         could be proportionally split based on the configured topic
>         rates.
>
>       - Girish added that in their case, the topics that should be
>         prioritized in bursting aren’t the ones with the highest
>         throughput.
>
>       - There would be a need to have SLA/SLO (Service Level Objective)
>         metadata for topics in the future that would help Pulsar
>         make proper prioritization decisions in these types of
>         scenarios.
>
>   - Girish continued explaining the details of rate limiting bursting
>     requirements by going over the document. There are very
>     valuable findings and observations that will be very helpful in
>     improving the Pulsar rate limiting solution. Girish has taken
>     an approach in the document where it goes beyond PIP-310 to
>     explain the requirements from his organization’s perspective.
>
>   - After going over Girish's Pulsar Rate Limiter document, there was
>     a discussion about the next steps for moving forward.
>
>     - There was a consensus that the default (“polling”) rate limiter
>       option in Pulsar is unusable in practice and this needs to be
>       addressed in the Pulsar core (see Girish’s analysis in the
>       document section “4.1 Existing pulsar rate limiter”
>       <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc/edit#heading=h.nx692qsf70id>).
>
>     - The group discussed the next steps in order to make progress.
>       There are two separate areas of work: addressing the issues
>       with the Pulsar default rate limiters, and addressing the
>       requirements that Girish brought up in his presentation over
>       the Pulsar Rate Limiter document
>       <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc>.
>
>   - Lari presented his view on how to address the issues in the Pulsar
>     default rate limiters, based on his blog post “Apache Pulsar
>     service level objectives and rate limiting”, section “Problems to
>     address as the next step”
>     <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html#problems-to-address-as-the-next-step>.
>
>     - The first goal is to reach feature parity with the current rate
>       limiters in Pulsar without introducing breaking changes.
>
>       - Instead of adding more feature flags to clutter the code base
>         and add more complexity, this would be handled as a
>         refactoring where the existing internal solution in the
>         Pulsar code base is replaced with the new solution that
>         addresses the problems explained in the blog post.
>
>       - The replacement solution for the refactoring has already been
>         sufficiently validated (explained in the blog post
>         <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html#problems-to-address-as-the-next-step>)
>         so that there’s confidence to move forward.
>
>   - There was a question whether this change could be implemented with
>     a feature flag instead of handling it as a refactoring where
>     old code gets deleted.
>
>     - Lari thinks that this would be a bad idea in this case since it
>       would increase complexity in the code base, and it would
>       make it even harder to maintain the code base in the future.
>       He would rather solve this by creating a minimal refactoring
>       PR that reaches feature parity with the existing solution in
>       a single PR.
>
>     - There was a discussion that it would be a hard PR to review
>       because it could be a large change since the current rate
>       limiting touches many parts of the code base.
>
>   - It was then discussed if a PIP should first be made before
>     starting to make further changes in this direction.
>
>     - There was a discussion about the PIP process. Lari said that
>       the process could be adjusted when it is needed. In this
>       case, Lari is planning to proceed by first creating a PR in
>       draft mode before writing a PIP. Lari’s opinion is that PIPs
>       could also be created in a different order when it makes
>       sense. In Apache projects, the Pulsar dev mailing list is
>       the place where decisions are made eventually.
>       There was a long discussion about the tradeoffs of PIPs and
>       the process. (I’m sorry that I couldn’t capture that in the
>       meeting notes. Someone also mentioned that Lari’s blog post
>       is already almost a PIP.)
>
>     - Lari explained that by creating the draft PR, it would also
>       show the extent of the required changes. Analyzing the
>       required changes without doing actual changes is not
>       practical in this case.
>
>     - **Conclusion 1**: Lari will attempt to create a PR for the
>       Pulsar rate limiting refactoring changes in draft mode, and
>       then proceed to create a PIP that covers the refactoring.
>       The main reason a PIP is needed for this change is that it
>       is a large code change touching multiple components, as
>       required by the PIP process guidelines. (PIP process
>       <https://github.com/apache/pulsar/blob/master/pip/README.md>).
>
>   - **Conclusion 2:** For Girish’s requirements for rate limiting, it
>     was agreed that Girish would start a “parent PIP” which focuses
>     on describing the Pulsar rate limiter requirements (outcomes)
>     and the problem instead of the solution. Child PIPs could
>     follow.
>
> The next meeting will be held on December 7th, 2023. Everyone is welcome
> to join. Here is the Pulsar Community Meeting calendar, which includes
> the Zoom link: https://github.com/apache/pulsar/wiki/Community-Meetings.
> Please add your agenda proposals to the meeting minutes document. You
> can find the link to this document on the community meetings page.