Similarly to Lari, I hear your concerns about not breaking client API compatibility, but I share his view of being playful about the changes. IMO, this mindset is essential for brainstorming. When delivering, we should then do so responsibly and according to a plan. The plan is what ensures that compatibility is not at stake and describes the rollout phases. So, I read Lari's thread as "Pulsar Community, let's *responsibly* play".

But now, let's continue with the brainstorming. I am not sure whether my suggestion is appropriate or whether pulsar-perf already (perhaps partially) supports this, so feedback and/or pointers to relevant material are much appreciated.

On my wishlist for Pulsar tooling is a pulsar-perf subcommand that allows us to do more advanced E2E performance validation of the platform. Basically, the development of E2E client-side metrics, for example:

1. percentile latencies
2. message loss
3. message re-ordering

Thank you,
Max

On Wed, 12 Oct 2022 at 13:18, Enrico Olivelli <eolive...@gmail.com> wrote:

> On Wed, 12 Oct 2022 at 00:40, Matteo Merli <matteo.me...@gmail.com> wrote:
> >
> > Agree, though let's make separate discussions. Putting all random ideas into the same cauldron is a good recipe for making no one able to follow or see a common line.
> >
> > That's what I meant when I started the proposal of having 3.0 completely detached from "features".
> >
> > If you start making a big container, you're going to fill it up with all the "breaking changes" that you want to include, because "hey, that's the only window of opportunity". That in turn is the surest way to not ship anything for the next 24 months, as all the changes are unavoidably going to get delayed and will take a long time to stabilize.
> >
> > Going back to API breakages:
> > 1. We never break wire protocol compatibility
> > 2. We try to never break client API
> > 3. We need a very good & compelling reason in order to break client API
> > 4. When we do so, we need to provide a clear path for users (eg: pulsar-client-1.x compatible drop-in).
>
> I totally agree with Matteo and Joe,
> We cannot break compatibility.
>
> and if we need to introduce some new form of inter-broker communication protocol
> we must support the current protocol and provide a smooth upgrade path.
>
> Now there are many Pulsar clusters around the world that cannot tolerate stop-the-world upgrades
> and we MUST also allow some sort of rollback in case of problems.
>
> Enrico
>
> >
> > --
> > Matteo Merli
> > <matteo.me...@gmail.com>
> >
> > On Tue, Oct 11, 2022 at 11:58 AM Dave Fisher <w...@apache.org> wrote:
> > >
> > > Let’s discuss any and all ideas for improvement. As each is discussed, we can figure out how to make them non-breaking. We all want Pulsar to improve.
> > >
> > > We should encourage an open discussion where no idea is automatically bad or wrong. They can just be discussed without fear.
> > >
> > > Thanks,
> > > Dave
> > >
> > > > On Oct 10, 2022, at 3:05 PM, Joe F <joefranc...@gmail.com> wrote:
> > > >
> > > > I would prefer that we avoid using the term “breaking changes”, which is too vague to convey any specific meaning. So let me try to bring some clarity.
> > > >
> > > > There have been many changes to implementations, APIs and data storage formats in Pulsar (and BookKeeper also). I have deployed many of these changes to production. And I know that Matteo and Rajan (and others too, about whom I’m not up to date) have implemented and deployed many such changes. But none of those changes ever required taking the system offline. NONE.
> > > >
> > > > Pulsar was developed as a 24x7x365 system, and rolling upgrades and rollbacks were a given. Like “this is water”, there was no special callout needed for declaring this reality. No change, including enhancements to wire protocols, broke client compatibility. Existing clients continued to work; they may not be able to use all the new features. Use of new features would require the app to be rebuilt anyway.
> > > > (Checksums, e2e encryption are examples)
> > > >
> > > > We have even succeeded in getting Pulsar adopted for some use cases, just because the complexity of upgrading from K’s old clients to new ones was costly enough to allow consideration of an alternative like Pulsar. The business cost of forcing a client upgrade can be significant, to the point of this being unviable for business. That just cannot be hand-waved away.
> > > >
> > > > There have also been changes in storage formats (the ZK metadata change from text to binary is an example). But through all such changes, compatibility and upgradeability have been a given. There has never been a situation where a live Pulsar upgrade was not possible, and a coordinated client upgrade was mandatory.
> > > >
> > > > So the question should not be about whether “significant” changes should be made or not. Changes can be made and released in a way that breaks *business*, or they can be made in a way that lets businesses sail smoothly through that change. So the question is about how such changes get rolled out.
> > > >
> > > > And to that question, my strong opinion is that any change that does not allow a live/rolling upgrade or rollback, or anything that forces a client to upgrade just to continue functioning, is a non-starter. All changes can be made in a compatible, phased manner, and in a way that does not penalise older versions (older versions doing worse on new releases is also not an acceptable way of making changes). Changes can be made in a manner that makes A/B testing possible by the user, with limited risk, and then choosing not to go back. It has all been done in Pulsar before.
> > > >
> > > > Would that be harder than just breaking stuff? Yes.
> > > > But that is far preferable to forcing users to take a hit.
> > > >
> > > > -joe
> > > >
> > > > On Sat, Oct 8, 2022 at 1:25 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > > >
> > > >> I would say first we should gather a list of changes which we want to target and find out which improvements really need a major version release. We can take the Pulsar-1.0 to Pulsar-2.0 upgrade as an example of how to avoid major interruption and impact on existing systems and still achieve our goal. So, the first step is discovery of such features, and then we can discuss how to introduce them in Pulsar with minimum impact on existing systems.
> > > >>
> > > >> Thanks,
> > > >> Rajan
> > > >>
> > > >> On Sat, Oct 8, 2022 at 1:05 PM Devin Bost <devin.b...@gmail.com> wrote:
> > > >>
> > > >>> I'm noticing some pushback on the idea of pre-emptively proposing any kind of breaking upgrade that would necessitate cutting a 3.0 release. I do understand the concern about introducing a breaking change... For a distributed messaging application like Pulsar, if clients needed to be simultaneously upgraded with brokers, that could be extremely difficult or infeasible for companies to coordinate without treating it like a migration to a new technology.
> > > >>>
> > > >>> At the same time, do we want to be completely closed to the possibility that a breaking change could be required at some point in the future? If a circumstance like that appears, those are the kinds of situations that can lead to a fork. Are there certain kinds of breaking changes that are more acceptable than others?
> > > >>>
> > > >>> Also, if the forward looking plan is to never introduce breaking changes, when *would* we ever cut a Pulsar 3.x release?
> > > >>> Do we have any criteria on what kinds of changes would necessitate cutting a new major release but would still be considered acceptable by the community?
> > > >>>
> > > >>> --
> > > >>> Devin Bost
> > > >>> Sent from mobile
> > > >>> Cell: 801-400-4602
> > > >>>
> > > >>> On Sat, Oct 8, 2022, 2:14 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > > >>>
> > > >>>> This sounds as if the current state of Apache Pulsar has a lot of issues and requires fundamental design changes to make it promising, which is definitely not true, and I disagree with it. And I would be careful comparing with Kafka, as I still don't think the Kafka release has anything to do with Pulsar's improvement. I would still recommend listing all the changes in one place so we can bring everyone onto the same page, discuss as a community, and make sure existing use cases continue using Pulsar rather than looking for Pulsar alternatives because of an incorrect impression of disruption and of the effort they might have to put in to upgrade or maintain Pulsar.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Rajan
> > > >>>>
> > > >>>> On Fri, Oct 7, 2022 at 7:49 PM Lari Hotari <lhot...@apache.org> wrote:
> > > >>>>
> > > >>>>> We could all have our own favorite names for this work. :)
> > > >>>>>
> > > >>>>> There's advice that you should disrupt yourself before someone disrupts you. Shouldn't we follow that advice for Apache Pulsar? We can disrupt Pulsar together with our Apache hats on. The catch is that since we are doing this, we will be able to learn and improve Pulsar so that we stay ahead of the competition.
> > > >>>>> Pulsar was a long way ahead of the competition for so many years, but Kafka is finally catching up. Did Kafka surpass Pulsar in some aspects with the recent 3.3 release, where KRaft became GA? That's a question that many might be asking. Why wouldn't we rev up Pulsar's engine and show the tail lights to Kafka?
> > > >>>>>
> > > >>>>> We don't have to have deadlines or any restrictions like that right now. The sky's the limit. Linus Torvalds has written a book called "Just for fun". I got my copy of this book signed by Linus himself in the year 2000 at an event that the book publisher had organized in Finland.
> > > >>>>>
> > > >>>>> What if we did this "just for fun"? The intention could also be to beat Kafka, but that could be a boring goal for many. What if we could unleash some talent that is among us and hasn't had a chance to show its full potential? Opensource is about joy. It is about welcoming everyone to join. Opensource should be egoless, although we must all admit that we don't succeed in that aspect. We must fight our biases.
> > > >>>>>
> > > >>>>> Jarek Potiuk explains the importance of being welcoming for success at Apache, in a 3-minute YouTube interview:
> > > >>>>> https://www.youtube.com/watch?v=Dx5kQnVFo7E
> > > >>>>> This interview is about Jarek's blog post "Success at Apache: Welcoming communities strengthens the Apache way":
> > > >>>>> https://news.apache.org/foundation/entry/success-at-apache-welcoming-communities
> > > >>>>> I was pleased to meet Jarek at ApacheCon among so many other welcoming personalities of the Apache community and the Apache Pulsar community.
> > > >>>>>
> > > >>>>> Goals have to be ambitious. What if we set the bar really high?
> > > >>>>> Apache Pulsar with 10 million topics in a cluster?
> > > >>>>> Why not go up to 100 million topics?
> > > >>>>> Just for fun. :)
> > > >>>>>
> > > >>>>> -Lari
> > > >>>>>
> > > >>>>> On 2022/10/07 22:53:59 Matteo Merli wrote:
> > > >>>>>> I actually disagree with the term "Pulsar Next Gen", because I haven't seen any proposal for which that would make sense to me to be called so.
> > > >>>>>>
> > > >>>>>> Rajan: That's the whole point of breaking it down. If you accumulate many "big" changes it introduces a lot of risk for instabilities and incompatibilities. Breaking it down in multiple steps helps to see the incremental changes and introduce them in a phased manner.
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Matteo Merli
> > > >>>>>> <matteo.me...@gmail.com>
> > > >>>>>>
> > > >>>>>> On Fri, Oct 7, 2022 at 3:37 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>> Hi,
> > > >>>>>>>
> > > >>>>>>> Can we get the list of changes in one place which we are planning to get as part of 3.0?
> > > >>>>>>> One thing I would like to see as part of a major release: it CANNOT impact existing use cases and users in any way that forces them to upgrade the client library. Applications using a < 3.0 version should continue getting all the client and server side enhancements and bug fixes. Failing to provide bug fixes and features to < 3.0 clients means we are forcing them to upgrade the client version and to put in the effort to handle all the incompatibilities, and that's something we should definitely prevent, because Apache Pulsar is used by many large scale business use cases and we should accommodate and motivate them to continue using Apache Pulsar.
> > > >>>>>>> I understand that as a Pulsar community we should always try to progress and build better, but not at the cost of losing or reducing the Apache Pulsar community.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Rajan
> > > >>>>>>>
> > > >>>>>>> On Fri, Oct 7, 2022 at 12:41 PM Lari Hotari <lhot...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thank you, Matteo. I agree that features should be delivered continuously when that is possible. In this case, that might not apply.
> > > >>>>>>>>
> > > >>>>>>>> I also agree that calling this Pulsar 3.0 isn't necessarily aligned with PIP-175 since an LTS release is when the major version is bumped. I'm fine with calling this "Pulsar Next Gen" or something that calls out that this is planning for making a major leap in Pulsar.
> > > >>>>>>>>
> > > >>>>>>>> There are several unresolved issues with PIP-45 and the Pulsar Load balancer. The previously referenced email threads contain a lot of context on this. Resolving the issues efficiently will most likely result in breaking changes, which will be the reason why it deserves a major version upgrade.
> > > >>>>>>>>
> > > >>>>>>>> We have discussed before that it's crucial to have a path to migrate users when there are breaking changes. This should be covered in any of the solutions that are introduced. Optimally, users of Pulsar would be able to upgrade seamlessly to Pulsar Next Gen / Pulsar 3.0, but rolling back might not be directly supported.
> > > >>>>>>>>
> > > >>>>>>>> I welcome everyone to join this planning for the Apache Pulsar Next Gen architecture. Please check the first email in this thread for details of context, and start participating and contributing today. The best way to contribute is to participate in the email threads, since they contain details with better context.
> > > >>>>>>>>
> > > >>>>>>>> -Lari
> > > >>>>>>>>
> > > >>>>>>>> On 2022/10/07 18:03:00 Matteo Merli wrote:
> > > >>>>>>>>> Given the past experiences and the discussions that already happened around "PIP-175: Extend time based release process", the idea is to detach the 3.0 from "big-features" items or "incompatible changes".
> > > >>>>>>>>>
> > > >>>>>>>>> The changes are going to get included as they are ready, within feature releases, and in a fully compatible way. We don't need to group them together and create unnecessary risk for the release schedule and the users.
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> Matteo Merli
> > > >>>>>>>>> <matteo.me...@gmail.com>
> > > >>>>>>>>>
> > > >>>>>>>>> On Fri, Oct 7, 2022 at 10:47 AM Lari Hotari <lhot...@apache.org> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hi all,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Greetings from ApacheCon North America 2022 in New Orleans!
> > > >>>>>>>>>> We had a great conference with a dedicated Pulsar track. Thanks to all presenters and everyone who attended. The talks weren't recorded, but the slides will be posted later on the conference website [1].
> > > >>>>>>>>>>
> > > >>>>>>>>>> At ApacheCon there were several presentations about "the Apache way" and what that means in practice. Based on that, we all know that no person is nominated as the CTO of Apache Pulsar who decides on Pulsar 3.0 and when that happens. It's us, the community, that serve that role together. We come together as individuals with the Apache hat on. Everyone is equal in the community, regardless of whether they are contributors, committers or PMC members.
> > > >>>>>>>>>> We welcome everyone to participate. The small detail about voting shouldn't stop anyone from participating in any aspects of the planning for the roadmap.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I'd like to get the discussions going for Pulsar 3.0. We don't need a separate decision to start planning that. Please correct me if I'm wrong or if you have a different opinion.
> > > >>>>>>>>>>
> > > >>>>>>>>>> There are a few previous discussion threads that are related to Pulsar 3.0 planning.
> > > >>>>>>>>>> If you are interested in getting involved with Apache Pulsar 3.0 planning, I think that it makes sense for you to read these threads carefully and reply to them. Please also suggest what you think makes sense.
> > > >>>>>>>>>>
> > > >>>>>>>>>> PIP-45 related: https://lists.apache.org/thread/tvco1orf0hsyt59pjtfbwoq0vf6hfrcj
> > > >>>>>>>>>> Pulsar Load balancer / namespace bundle related: https://lists.apache.org/thread/roohoc9h2gthvmd7t81do4hfjs2gphpk
> > > >>>>>>>>>> renaming topics: https://lists.apache.org/thread/vrr75rrh4trqlp14objh3snlfvmzdrp2
> > > >>>>>>>>>> backpressure: https://lists.apache.org/thread/v7xy57qfzbhopoqbm75s6ng8xlhbr2q6
> > > >>>>>>>>>>
> > > >>>>>>>>>> Long list of Metadata inconsistency issues by Zac Bentley:
> > > >>>>>>>>>> https://github.com/apache/pulsar/issues/12555
> > > >>>>>>>>>> That would be a good starting point for understanding the data inconsistency issues related to the current PIP-45 design. Perhaps those could be addressed already before Pulsar 3.0?
> > > >>>>>>>>>>
> > > >>>>>>>>>> I'm looking forward to everyone's participation in the Apache Pulsar 3.0 planning discussions.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>>
> > > >>>>>>>>>> -Lari
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1 - https://www.apachecon.com/acna2022/schedule.html