On Wed, 26 Feb 2025 at 09:47, Yunze Xu <x...@apache.org> wrote: > > Creating a PIP could emphasize that new configs are added and explain > why these configs are introduced. While with a regular PR, especially > for many recent PRs that have thousands lines of changes, the > configuration changes are easy to be ignored. Your question can be > applied for all changes required by a PIP: what's the problem for > changes to metrics, wire protocol APIs to users?
I agree that's a fair point. PIPs might not be the best way to capture the purpose of each configuration setting and when they are added. In many bug fix cases, a PIP was added long ago, but configurables are added later when it's noticed that fixing a bug requires a different solution which introduces a new configurable setting. It would be better to maintain configuration setting documentation in the codebase through annotations for the configuration values, as we already do. The current doc annotation is used for generating documentation visible at https://pulsar.apache.org/reference/#/4.0.x/config/reference-configuration-broker. The problem is that all config is in one basket, and the configuration "audience" categorization I mentioned in my previous email is missing. The annotation could contain a longer field for detailed reasoning or references. Perhaps that could be in the javadoc. How many of us really look at the PIPs each time we're working on the codebase? The source of truth for configuration should be in the codebase. The website could be improved to display this information in different ways. The existing https://pulsar.apache.org/reference/#/4.0.x/config/reference-configuration-broker is a good starting point. How many people are even aware that this exists? > If the rule is too strict, let's change the rule. > If the rule is ambiguous, let's make it clear or leave an additional > note for special cases. One possible way to clarify things is differentiating the level of a rule with SHOULD and MUST, as done in the ASF release policy, https://www.apache.org/legal/release-policy.html#release-approval. For example, "For a release vote to pass, a minimum of three positive binding votes and more positive binding votes than negative binding votes MUST be cast," but there's "Release votes SHOULD remain open for at least 72 hours." In the Pulsar PIP process, I think everything is a SHOULD. Assuming good intentions and trusting each other will lead to good results. Some flexibility in rule-following is necessary to deliver value to our users. With MUST and SHOULD classifications, we can clearly indicate which rules allow for exceptions when needed. While ASF is often criticized for rigidity, it actually imposes relatively few hard requirements on projects. I believe the real challenges lie in the release policy and lack of tooling for automated releases. Once those are addressed, most issues resolve themselves. Adding unnecessary rigidity to Apache Pulsar would hinder the project's ability to evolve and remain valuable to our users. Rules should be designed with some flexibility built in. This approach may seem counterintuitive at first, but it's a proven way to empower contributors to achieve great results without the paralysis that comes from fear of breaking rules. Creating psychological safety is essential for team success. Research from Google [1, 2] shows that teams thrive when members feel safe to take risks, speak their minds, and be vulnerable without fear of punishment. By making it acceptable to occasionally deviate from rules with good reason, we remove the barriers that hold us back from innovation and enjoyment in our work. My hope is that Apache Pulsar can be an environment where innovation flourishes and contributors find satisfaction in their work. The natural outcome will be delivering exceptional value to our users. -Lari 1 - https://rework.withgoogle.com/en/guides/understanding-team-effectiveness 2 - https://www.nytimes.com/2016/02/28/magazine/what-google-learned-from-its-quest-to-build-the-perfect-team.html?smid=pl-share > > Another important point is that we made a rule but didn't follow the > rule. It harms the community especially for new contributors. For > example, IIRC, there were debates for some metrics related PRs > cherry-picked into the LTS branch many times. > > If the rule is too strict, let's change the rule. > If the rule is ambiguous, let's make it clear or leave an additional > note for special cases. > > Thanks, > Yunze > > On Wed, Feb 26, 2025 at 3:24 PM Lari Hotari <lhot...@apache.org> wrote: > > > > For some of the added configurations, you are right that PIPs are needed. > > > > Most configurables are settings for internal implementation details > > that don't need to be changed. The reason why configurable options are > > added is that it's useful if the value can be modified when a user > > faces an issue that could be solved by tuning the limit. That's > > something that could be hard to anticipate when a change (either a bug > > fix or new feature) is made. > > > > For example, PR #23901 introduced > > managedLedgerMaxReadsInFlightPermitsAcquireTimeoutMillis and > > managedLedgerMaxReadsInFlightPermitsAcquireQueueSize. The original > > managedLedgerMaxReadsInFlightSizeInMB feature contained a gap where > > the solution would never work properly when the system was under heavy > > load and hit the limit. That's the whole point of the > > managedLedgerMaxReadsInFlightSizeInMB solution—it adds a limit. If the > > limit doesn't work, that's a bug. It's not a new feature when that > > problem is fixed. > > > > The configs will get documented in Pulsar documentation since our > > website will generate documentation based on the annotations in the > > ServiceConfiguration file. > > > > What is the problem that these new configs have caused for our users? > > How would creating a PIP improve the situation for our users? > > > > -Lari > > > > On Wed, 26 Feb 2025 at 06:59, Yunze Xu <x...@apache.org> wrote: > > > > > > The changes to broker configs can be shown by the following command: > > > > > > ``` > > > git log --pretty=oneline > > > pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java > > > ``` > > > > > > Then I found some other PRs. > > > > > > https://github.com/apache/pulsar/pull/23709 adds an > > > `allowAclChangesOnNonExistentTopics` config. However, from the comment > > > https://github.com/apache/pulsar/pull/23709#issuecomment-2533780608, > > > it's addressing a breaking change in > > > https://github.com/apache/pulsar/pull/22547. Yeah, it happens again > > > that 22547 is treated as a "bug fix" so the breaking change is > > > acceptable for LTS releases. > > > > > > https://github.com/apache/pulsar/pull/22792 adds a new config > > > `managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis`. It's > > > cherry-picked to branch-4.0 > > > > > > https://github.com/apache/pulsar/pull/20147 adds a new config. It's > > > not cherry-picked to branch-3.0 but only as a part of the new 4.0 LTS > > > release. > > > > > > ---- > > > > > > A solution I can think of is to have an "unstable" config in addition > > > to the existing config. The standard for which configuration change > > > requires a PIP is really ambiguous. Pulsar's configuration now becomes > > > really big and complicated, with many `xxxNumThreads`, `xxxTimeoutMs` > > > configs that I cannot think of a reason in which case they should be > > > configured. > > > > > > Thanks, > > > Yunze > > > > > > > > > On Wed, Feb 26, 2025 at 12:42 PM Yunze Xu <x...@apache.org> wrote: > > > > > > > > Hi all, > > > > > > > > I noticed two PRs were merged recently and cherry-picked into > > > > branch-3.0 and branch-4.0, where 3.0 and 4.0 are LTS releases. > > > > > > > > https://github.com/apache/pulsar/pull/24012 fixes a bug for the > > > > implementation of PIP-322. However, it adds a new configuration > > > > without a new PIP or change on the existing PIP. It definitely > > > > violates "When is a PIP required" [1]. (Any change to the > > > > configuration) > > > > > > > > https://github.com/apache/pulsar/pull/23697 fixes a message loss issue > > > > caused by geo-replication by introducing a new feature flag > > > > `supports_repl_dedup_by_lid_and_eid`. It also violates [1] (Any new > > > > feature for Pulsar brokers or client) > > > > > > > > I also noticed https://github.com/apache/pulsar/pull/24025, which adds > > > > a new config `managedLedgerOffloadReadThreads` as well. Though it's > > > > only cherry-picked into branch-3.3 but not for LTS releases (3.0 and > > > > 4.0) > > > > > > > > I found the definition of "bug-fixes" ambiguous. It seems like a > > > > silver bullet that if we want a PR to be included in the LTS release, > > > > we only need to say, "it's a bug fix", no matter if it changes > > > > anything that requires a PIP. > > > > > > > > Based on the fact that the PIP rule is not widely followed, should we > > > > update and loose the requirement? > > > > > > > > [1] > > > > https://github.com/apache/pulsar/tree/master/pip#when-is-a-pip-required > > > > > > > > Thanks, > > > > Yunze