Thank you for your offer, Enrico.

> If you are now blocked, then other users will be blocked as well.

I looked into this more today, and I believe that I am blocked from using
2.7.1. I configured Prometheus's server-side filtering for the
high-cardinality metrics, but my Prometheus instance is still getting
OOMKilled due to the collective size of the metrics payload returned by my
brokers. In my use case, I encountered problems with around 40k topics, each
with a single subscription. For reference, I ran the same load against 2.7.0
brokers and had no issues with my Prometheus instance.

Sijie,

Thanks for your reply.

> The bugfix releases are usually made monthly based on demand. We can
> probably wait 1~2 weeks to see if there are any other fixes to include
> before cutting a 2.7.2 release. Does that make sense?

Are there known bug fixes that you are looking to get merged in the next 1
or 2 weeks?

I agree with the general timeline of doing bug fix releases monthly based
on demand. I also think there should be room for extraordinary
circumstances where we should release early to fix an issue that impacts
many users. Given Pulsar's advertised ability to handle up to a million
topics, I think this is such a situation. Let me know what you think.

Thanks,
Michael

On Wed, Mar 31, 2021 at 6:31 PM Sijie Guo <guosi...@gmail.com> wrote:

> Michael,
>
> The bugfix releases are usually made monthly based on demand. We can
> probably wait 1~2 weeks to see if there are any other fixes to include
> before cutting a 2.7.2 release. Does that make sense?
>
> Thanks,
> Sijie
>
> On Tue, Mar 30, 2021 at 9:55 PM Michael Marshall <mikemars...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I propose and request that we release version 2.7.2 to fix a regression
> > introduced in 2.7.1.
> >
> > Pulsar 2.7.1 introduced cursor level metrics without including the ability
> > to disable them (https://github.com/apache/pulsar/pull/9618). I recently
> > discovered the metrics when I created a Pulsar 2.7.1 cluster, created
> > thousands of topics and subscriptions, and then started to have problems
> > with my prometheus instance because of an influx of metrics. The fix to
> > make these metrics optional and disabled by default has already been merged
> > to the "branch-2.7" branch (https://github.com/apache/pulsar/pull/9814).
> >
> > Given the cardinality of the metrics produced for every cursor and the fact
> > that Pulsar is supposed to handle many topics and subscriptions with ease,
> > I consider the creation of too many metrics a regression, and I think it is
> > important to release a new, latest version.
> >
> > Further, 2.7.1 included several important bug fixes (e.g. one to fix tiered
> > storage to AWS S3), so I would prefer to move forward instead of back to
> > 2.7.0.
> >
> > What do others think about cutting a 2.7.2 release now? Do others agree
> > that creating metrics for every cursor should be considered a regression?
> > If not, does the community have a helpful guide to determine what should be
> > considered a regression?
> >
> > Before writing this email, I consulted PIP 47, Pulsar's time based release
> > plan
> > (https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan).
> > The PIP mentions that there will be bug fix releases for the last 4
> > releases, but it doesn't mention a cadence.
> >
> > Tangentially, I am wondering why the 2.7.1 release wasn't held up to
> > include this configuration fix. PR 9814 was submitted before the 2.7.1 tag
> > was created and was merged just 2 days after the tag's creation. What are
> > the criteria for holding up a release?
> >
> > Thanks for considering my request, and thanks for any feedback you can
> > provide.
> >
> > Best,
> > Michael Marshall
> >
>
