Hi Pulsar Community,

Here are the notes from the community meeting two weeks ago. We had several very good discussions.
I want to emphasize that we do not make any official decisions during these meetings. It is an essential part of the Apache Way [0] to make project decisions on the dev mailing list. I share the meeting notes to archive insights, context, and ideas from the meetings that will hopefully be valuable as we work to improve Pulsar.

Disclaimer: if something is misattributed or misrepresented, please send a correction to this list.

Source google doc: https://docs.google.com/document/d/19dXkVXeU2q_nHmkG8zURjKnYlvD96TbKf5KjYyASsOE

[0] https://www.apache.org/theapacheway/

Thanks,
Michael

- 2022/10/13, (8:30 AM PST)
- Attendees:
  - Matteo Merli
  - Michael Marshall
  - Lari Hotari
  - Dave Fisher
  - Andrey Yegorov
  - Rajan Dhabalia
- Discussions/PIPs/PRs (generally discussed in the order they appear)
  - Michael: what is the current flow for getting PRs to run CI?
    - Lari: CI will run when a PR is either approved or has the ready-to-test label, and the "/pulsarbot rerun-failure-scenarios" comment is added to the PR. Also, the flaky tests have been failing for a while and we don't have enough attention on them. We do seem to have more resources.
    - Dave: we could use ASF infra.
    - Matteo: we switched because the experience there was terrible.
    - Dave: there have been a lot of updates in the past couple of years, though I haven't tested it.
    - Matteo: also, switching CI jobs is a ton of work. There was resource contention previously on the ASF infra.
    - Dave: you can get your own dedicated workers. Companies do fund project-related nodes.
    - Matteo: you can also do this with GitHub Actions.
    - Dave: there are some security concerns, such as when the ASF infra will trust runners.
    - Matteo: one of the benefits of GitHub Actions is having macOS and Windows runners.
    - Michael: the flaky test reports are really helpful.
    - Lari: that is something I handed off to Nicolò. At ApacheCon, Gradle announced that there will be a Gradle Enterprise integration. This will give us live reports on the build, flaky tests, and more.
    - Matteo: is it a runner or uploading the build data?
    - Lari: it is an upload.
    - Dave: there is also a speculative test run feature.
    - Matteo: we'd need to build with Gradle.
    - Lari: they have plugins that integrate into Maven builds.
    - Matteo: I would still support switching to Gradle. There were issues in the BookKeeper project.
    - Lari: I was late to it, but I was disappointed to see that initiative fall through.
    - Matteo: I feel similarly. We'll need strong support from multiple committers to get it done. It would improve the developer experience. It would be a lot of work, so we need to decide that we want to get it done and be very committed to getting it done.
    - Dave: I've seen projects where there were big problems in the community; if the change doesn't have complete buy-in, it'll cause fracture.
    - Matteo: we also need to make sure that multiple people will push on the work. If there is enough community support, we can get this done.
    - Rajan: is the main benefit of switching speed?
    - Lari: one of the first benefits is that it has a correct incremental build. Many of the benefits are around that. Build caching is correct and consistent.
    - Rajan: so the speed?
    - Lari: it's also the correctness. Often with Maven, you need to do a clean install because you don't know that the build is correct.
    - Andrey: was correctness the issue in the BookKeeper project?
    - Matteo: it was related to the build.
    - Andrey: it had to do with the binary artifacts and issues with how they were produced. Then the people who contributed it were unavailable to support it.
    - Matteo: that is why we want to make sure we have buy-in. The other benefit is the preciseness of the dependencies. In Maven, we are using indirect dependencies, so we don't have a strong sense of what we need. With Gradle, by default, you have to declare dependencies. This helps clarify the classpath.
  - Rajan: https://github.com/apache/pulsar/pull/17962 - blue/green cluster migration support (PIP 188). This is the first part, which is most of the changes.
The second part requires some other changes to maintain ordering. I plan to implement that once this first PR is merged. Please review it if you have time.
    - Lari: this is late to share, but I came across another submission of a similar feature that wasn't submitted to the project itself: https://github.com/pkumar-singh/pulsar/wiki/PIP-95:-Live-migration-of-producer-consumer-or-reader-from-one-Pulsar-cluster-to-another
    - Matteo: that design is from Splunk.
    - Rajan: that was incomplete, and PIP 188 was made as a kind of successor.
  - Michael: just want to check in on https://github.com/apache/pulsar/issues/18012. There was a question about multiple ZooKeeper instances (see the issue for context). Will that be a blocker for this PIP?
    - Matteo: it should not be a blocker. That is a non-standard deployment (though some users do take that approach). Making a consistent default makes sense.
  - Lari: I want to talk about the mailing list discussion on breaking changes. I agree that we shouldn't have major breaking changes.
    - Rajan: I want to be clear that we talk about the changes. I wasn't at ApacheCon to have an offline discussion. These breaking changes sound scary.
    - Matteo: I was not at ApacheCon and did not have any of those meetings. My fear with large breaking changes is feature creep: because a major release is the only time to make such changes, it attracts many different changes at once. That increases the risk of unintentional breakage due to concurrent changes, which will inevitably delay the releases. Let's break each of these changes down instead of making them a big 3.0 release, and deliver each of them as an individual feature update. For example, the load balancer changes Heesung is working on will be added and non-default for a while as they are hardened. My point is that we shouldn't tie it all together. Let's add features incrementally.
    - Dave: that is a reasonable approach to managing risk in change.
If a group of the committers wanted to move forward with a proof of concept of these ideas, where would we do that? Should we have a feature branch or do it in a fork? Where does the community prefer that activity to occur?
    - Matteo: in either case, a non-trivial change will take weeks or months to stabilize, keep up to date, and manage the conflicts with the branch. My suggestion for long experiments is to make the interface pluggable. If the current abstraction is not the right one, expand the interface. Let's try to make the interface better. Then you can do whatever you want within the interface without risk of divergence. You could even add an experiment into master without risk.
    - Lari: I'm planning to make PIP-45, PIP-157, and PIP-192 obsolete in my PoC. The PoC will prove that it simplifies the solution and provides dramatic results. I'm targeting a 100 million topic inventory in the PoC. Anyone from the community is welcome to join the PoC.
    - Matteo: what you describe, I couldn't see how it would work. There are multiple other things that you need to get to a hundred million topics; multiple things break. Even going to 5 or 10 million topics requires breaking things. For example, where are you going to store the metadata? I agree with experimentation, but let's lay down all of the steps. The load manager is one; we can have different implementations. The other problem is the metrics: the amount of memory we need to get the metrics out is high. Here is a thread from Asaf: https://lists.apache.org/thread/comd0o5760fcgc8qn5d0s7bs7c3zs1j3
    - Michael: can we remove metrics?
    - Matteo: see the thread; that is part of it.
    - Andrey: imagine 50 BookKeeper ledgers per topic; trying to decommission a BookKeeper node when there are 100 million topics creates interesting problems.
    - Lari: my PoC will show how to get to that inventory.
    - Andrey: I'm not sure it matters. The issue is that the auditor needs to scan all of the metadata.
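For context on Matteo's suggestion of experimenting behind a pluggable interface rather than in a long-lived branch, here is a minimal sketch. All names here (LoadManager, DefaultLoadManager, etc.) are hypothetical and for illustration only; they are not the actual Pulsar interfaces. The point is that the experimental implementation can live in master while the default behavior stays untouched, because the implementation is selected by configuration.

```java
// Hypothetical sketch, not Pulsar code: an experiment behind a pluggable interface.
import java.util.Map;
import java.util.function.Supplier;

public class PluggableSketch {

    /** The stable abstraction that both implementations share. */
    interface LoadManager {
        String assignBroker(String topic);
    }

    /** Current behavior, left untouched by the experiment. */
    static class DefaultLoadManager implements LoadManager {
        public String assignBroker(String topic) {
            return "broker-" + Math.floorMod(topic.hashCode(), 3);
        }
    }

    /** Experimental implementation; safe in master because nothing uses it by default. */
    static class ExperimentalLoadManager implements LoadManager {
        public String assignBroker(String topic) {
            return "exp-broker-" + Math.floorMod(topic.hashCode(), 16);
        }
    }

    /** The implementation is chosen by configuration, so the experiment is strictly opt-in. */
    static LoadManager create(String configuredImpl) {
        Map<String, Supplier<LoadManager>> registry = Map.of(
                "default", DefaultLoadManager::new,
                "experimental", ExperimentalLoadManager::new);
        return registry.getOrDefault(configuredImpl, DefaultLoadManager::new).get();
    }

    public static void main(String[] args) {
        System.out.println(create("default").assignBroker("persistent://t/ns/topic"));
        System.out.println(create("experimental").assignBroker("persistent://t/ns/topic"));
    }
}
```

If the current abstraction turns out to be too narrow for the experiment, the interface itself is expanded first, which is exactly the "make the interface better" step described above.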
    - Matteo: there are a bunch of problems in the Pulsar broker: metrics, the load manager, the way service discovery works. We don't know what is going to break. We need to know how we can test and verify functionality. I think that we can do this step by step and in a non-breaking way.
    - Rajan: Pulsar is run as a shared cluster in companies where there are many independent teams, so I think it is important to be able to upgrade the server without upgrading the clients.
    - Matteo: that is always the case, especially if you consider geo-replication. You cannot break the protocol for that. We've broken the client API unintentionally, but I don't think we have broken binary compatibility, even unintentionally. Let's find ways to make the changes backwards compatible.
    - Rajan: can we try to document all of the changes somewhere?
    - Matteo: let's not make a waterfall release. We're going to need to vote on the new release plan. This makes the 3.0 release long term support. I want to decouple this feature work from the 3.0 release.
    - Dave: I think he wants the features/PIPs that would be in that release. I think that is a fair enough document to have.
    - Michael: I think we can definitely have a list, but I also think we're at an early stage here.
    - Matteo: looking at Java proposals, they do not tie larger proposals to a version. For example, Project Loom has been a multi-year effort and has continued to iterate and refine. I think we should use this as a guide. Java has shipped a lot of big changes very quickly in a way that has minimized the risk.
    - Michael: I agree that we shouldn't tie any of these features to a version.
    - Lari: I agree. We've been talking at a meta level. One high level idea is that it does go in a similar direction as Kafka's KRaft. When you remove a bundle, this helps.
    - Matteo: first, KRaft has been a six-year effort and isn't ready for production yet. It still has the same limitations as ZooKeeper.
    - Lari: does etcd not also have these limitations?
    - Matteo: yes.
    - Lari: I am not cloning KRaft. I think the namespace bundle is one of the reasons we're limited at the moment, and we should get rid of it. In this proof of concept, I can prove this, and if it fails, I can revise that. Talk is cheap.
    - Matteo: on the goal of reaching 100 million topics, I have that goal too, and it is a good goal; the question is how to get there. I do not view bundles as the main problem.
    - Lari: I think it is the problem, and I think removing it will be necessary for making progress. That is my current assumption. A lot of things change when you start changing that, so it is not very easy.
    - Matteo: I am still not sure how they will change, even from a theoretical point of view.
    - Lari: I'll be writing blog posts about that. I have worked on IoT device architectures with sharding.
    - Matteo: I've seen many sharding solutions that work on paper but do not work for various reasons. There are also good reasons for doing other things. You get better at one thing, but maybe you make something else impossible to achieve.
    - Lari: it is true that there are always trade-offs. I think everything will fall into place once we change the namespace bundles.
    - Matteo: I am skeptical, but I look forward to reading the blog posts.
    - Lari: I think many people would come to this design.
    - Matteo: just saying that the bundle concept was very intentional, based on experience running distributed systems.
    - Lari: I think there are many trade-offs that were impossible to see.
    - Matteo: you were mentioning sharding per cluster, which will have certain limitations related to issues when adding nodes and certain other events.
    - Lari: that is true with consistent hashing, and I think there are ways to make that happen online without disruption of service.
    - Matteo: the other point is that even if you solve that, that isn't the fundamental scalability issue. You still need to figure out where to store the data and how the clients and brokers interact.
Also, on making PIP 45 obsolete: PIP 45 was about making an abstraction for how we interact with ZooKeeper so that it wasn't a hard-coded dependency. It took a year and a half of effort to factor it out.
    - Lari: the design is still a lot around the namespace bundle.
    - Matteo: PIP 45 is not related to the namespace bundle. It is just about putting the ZooKeeper usage behind an abstraction.
    - Lari: I agree on that. With this design, metadata is handled in a completely different way.
    - Matteo: you use metadata in a different way, but you still need an interface, right?
    - Lari: no, the idea is to think of metadata not as a separate entity, but as component state and state transitions. It has to do with data locality and single-writer principles. The PoC will not touch BookKeeper.
    - Matteo: so you still need PIP 45.
    - Lari: yes.
    - Matteo: you still have the managed ledger. [Note taker missed some points]
    - Matteo: I want to mention that I am open to reconsidering the design, but I also want to say the current design was thoughtful.
    - Lari: Pulsar has been extremely successful.
    - Matteo: sometimes you have limited time too. Let's just make sure we get all of the context together and have all of the trade-offs laid out in the discussion. My point is also: let's lay out steps so that we can move towards 100 million topics.
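To illustrate the consistent-hashing point in the sharding discussion above, here is a minimal sketch (hypothetical, not Pulsar code) of a hash ring with virtual nodes. The property Lari alludes to is that when a new node joins, only the keys whose nearest ring position changes owner move, roughly 1/N of them for N nodes, rather than a full reshuffle; the trade-offs Matteo raises (rebalancing during node adds and other cluster events) are about exactly those moved keys.

```java
// Hypothetical sketch, not Pulsar code: a consistent-hash ring with virtual nodes.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HashRingSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private static final int VNODES = 64; // virtual nodes per broker smooth the spread

    // Simple integer mixer so similar strings don't land on adjacent ring positions.
    static int hash(String s) {
        int h = s.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h;
    }

    void addNode(String node) {
        for (int i = 0; i < VNODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    String lookup(String topic) {
        // Walk clockwise to the first virtual node at or after the topic's hash.
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(topic));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Fraction of topics whose owner changes when a fourth broker joins. */
    static double movedFraction(int topicCount) {
        HashRingSketch r = new HashRingSketch();
        r.addNode("broker-1");
        r.addNode("broker-2");
        r.addNode("broker-3");
        List<String> before = new ArrayList<>();
        for (int i = 0; i < topicCount; i++) before.add(r.lookup("topic-" + i));
        r.addNode("broker-4");
        int moved = 0;
        for (int i = 0; i < topicCount; i++) {
            if (!before.get(i).equals(r.lookup("topic-" + i))) moved++;
        }
        return moved / (double) topicCount;
    }

    public static void main(String[] args) {
        System.out.printf("fraction of topics moved: %.2f%n", movedFraction(2000));
    }
}
```

Doing this "online without disruption of service", as suggested in the discussion, would additionally require handing off only the moved topics while both owners coordinate, which this sketch does not attempt.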