Dear Pulsar Community,

As we prepare for new releases in our maintenance branches, we have once again encountered issues with our cherry-picking process. Some of our maintenance branches are currently broken or were recently broken, containing compilation errors or failing tests. Many contributors have run into these issues, as we have seen new PRs come in to address the problems.

The compilation problems are already being addressed by Heesung (release manager for 3.0.3) and myself. We aim to resolve these issues as soon as possible. Please join the #dev channel on Apache Pulsar Slack to collaborate in real time, help with this, and get updates.

The cherry-picking process has always been problematic in Apache Pulsar. It is mostly based on tribal knowledge and lacks clear documentation. This often leads to our maintenance branches breaking, especially as we approach release dates and begin cherry-picking fixes. This recurring issue has been the subject of multiple discussions over the years. The "feature freeze" in the release process does not mitigate the key problem with the cherry-picking approach. I have previously expressed my concerns about this on the mailing list in this thread: https://lists.apache.org/thread/69mwjso51kzkrv5xgdmw04d9wngbg8br

Many problems with cherry-picking arise because cherry-picks occur in the wrong order, or because dependent changes are not picked. Some dependent changes shouldn't be picked at all: by the time a bug fix lands in the master branch, master can already contain changes for new features that shouldn't be applied to maintenance branches. In those cases a backport of the fix is needed. The original developer of the PR might not be available to do this, and the release can be delayed significantly if delivering the backport takes time. When cherry-picking and backporting are delegated to other developers, this can lead, in addition to delays, to coordination problems and to commits being picked and applied in an order that results in even more merge conflicts. Thankfully, this isn't usually too painful, but it does happen once in a while.

A few days ago, I began working on improving the documentation of the current process. I have added a section where I share some thoughts and a tool to prevent future problems. You can find the document here: https://pulsar.apache.org/contribute/release-process/#cherry-picking-changes-scheduled-for-the-release. However, this does not fully describe the current process and will only help to some extent.

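To illustrate the ordering problem mentioned above, here is a minimal Python sketch. It is not the tool from the release process document, only an illustration under assumptions: the function name `order_picks` and the commit ids are hypothetical, and the master history is assumed to be available as a list of commit ids, oldest first (for example from `git log --reverse --format=%H master`). The idea is simply to apply the selected commits in the same order in which they landed on master.

```python
def order_picks(master_history, picks):
    """Return the selected commits sorted by their position on master.

    master_history: commit ids on master, oldest first (assumed input).
    picks: commit ids selected for the maintenance branch, in any order.
    """
    position = {sha: index for index, sha in enumerate(master_history)}
    missing = [sha for sha in picks if sha not in position]
    if missing:
        # A pick that is not on master cannot be ordered; fail loudly.
        raise ValueError(f"commits not found on master: {missing}")
    # Drop duplicates, then sort oldest-first by master position.
    return sorted(set(picks), key=position.__getitem__)


# Hypothetical commit ids, oldest first.
history = ["a1", "b2", "c3", "d4"]
ordered = order_picks(history, ["d4", "a1", "c3"])
print(ordered)
```

Picking in master order does not remove the need for backports when a fix depends on feature changes, but it should avoid conflicts caused purely by applying commits out of sequence.
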
The added section should help prevent cherry-picking in the wrong order, but it still has many gaps. Many developers do not have proper merge conflict resolution tools configured. Without proper 3-way diff visualization and merge tools, it's very difficult to resolve many of the merge conflicts without making mistakes. Resolving conflicts also requires a deep understanding of the module where the conflicts occur.

After we have made the next set of maintenance releases, I plan to propose an alternative to the cherry-picking process that will address the main issues that the Apache Pulsar project has been struggling with every time we do releases. The alternative would be to designate the LTS branch as the default branch, make bug fixes primarily in the LTS branch, merge fixes to newer branches, and cherry-pick to older branches where needed. This approach is common in many projects and leverages what Git does well: handling development across multiple branches.

This solution ensures that our LTS branch is always immediately in a releasable state. The branch will also become the most stable version of Pulsar, since bug fix PRs target the LTS branch and are therefore continuously evaluated and integrated there by our CI. Stability was the original goal of PIP-175, where the LTS concept was introduced to Pulsar.

I hope that our community will be open to making changes to the maintenance strategy to help resolve the pain that we have to deal with each time we make releases. Sometimes, this "cherry-picking vs. merging branches" discussion becomes a "tabs vs. spaces" type of pointless discussion where personal preferences are emphasized. I hope that we can avoid that and admit the fact that releasing Apache Pulsar LTS with this cherry-picking process is a pain and we must fix it to make progress as a development community.

-Lari