Re: [DISCUSS] Apache Pulsar 2.10.0 release

Matteo Merli Wed, 16 Feb 2022 12:42:56 -0800

Michael, as we chatted in last weekly meeting (though not yet
formalized), since we have never really done a feature freeze on the
branch during paste releases, we should start from the next release,
to give a decent preview of what to expect to developers in terms of
dates.


> While some may feel "behind" in getting out the 2.10 release, our
> priority must be to give the community time to verify the stability of
> the release.

I don't think that changes the picture here. There are *always* last
minute issues being discovered, and there is a call to be made on a
case by case. The feature freeze will reduce the likelihood of
introducing *more* issues by getting it from the master branch, but
won't change a comma from issues that were already there.




--
Matteo Merli
<[email protected]>

On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <[email protected]> wrote:
>
> > I will build the release and start the vote before next Monday(GMT+8)
>
> Penghui, is your current plan to create branch-2.10, create the
> release artifacts, and start a vote on them all within a few days of
> each other?
>
> > I'm doing my best to follow PIP 47, but when seeing a potential break
> > change, I need to confirm it.
> > After all the potential break changes have been confirmed and fixed, I will
> > start the vote thread.
>
> I think we should review our current release plan before we move
> forward as proposed above. PIP 47 explicitly says that a month before
> the release date, the release manager will cut branches [0]. We don't
> yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> feature freeze and then a code freeze. We have not yet had either.
>
> I propose we create branch-2.10 now and simultaneously announce that
> we are past the feature freeze period. Then, we can start the 2 week
> period for bug fixes that precedes the code freeze, as PIP 47
> prescribes. Then, in two weeks, we can produce the first release
> candidate (also in PIP 47).
>
> While some may feel "behind" in getting out the 2.10 release, our
> priority must be to give the community time to verify the stability of
> the release.
>
> Thanks,
> Michael
>
> [0] https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
>
>
>
>
> On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <[email protected]> wrote:
> >
> > Hi all
> >
> > Just put an update here.
> >
> > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> > https://github.com/apache/pulsar/pull/13341
> > need to do the final verification, and you are also very welcome to verify
> > these 2 changes in your environment, cases.
> >
> > I will build the release and start the vote before next Monday(GMT+8)
> >
> > Regards
> > Penghui
> >
> > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <[email protected]> wrote:
> >
> > > Hi lari,
> > >
> > > > So finally, I understand that "the problem" is that all HTTP server
> > > threads are blocked and this makes the Pulsar Admin API unavailable.
> > >
> > > To support the blocking servlet API, Jetty uses a default thread pool that
> > > can grow to up to 200 threads (
> > > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > .
> > > However this default of 200 maximum threads is not used in Pulsar.
> > >
> > >  Regarding the "make async" changes, It is an optimization to migrate from
> > > the blocking servlet api to the asynchronous servlet api. This work isn't
> > > urgent since we can simply mitigate the HTTP server threads getting 
> > > blocked
> > > by setting "numHttpServerThreads=200" in broker.conf. "the problem" will 
> > > be
> > > resolved immediately without risks of regressions that are involved in
> > > making the sync -> async changes.
> > >
> > > Yes, this is the problem. But I am against using 200 threads as the max
> > > web server thread by default,
> > > it can't work for cases that the broker without that much memory, it will
> > > lead to more serious problems
> > > that the service quality of messaging API gets worse due to the JVM
> > > GC, even memory overflow.
> > >
> > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> > > and all the PRs are not cherry-picked to branch-2.x
> > > This is an optimization for pulsar, the current implementation does not
> > > use jetty async API well, we should fix it,
> > > we should reduce the code with bad smells, and using async API is also
> > > a more efficient way without opening such jetty threads.
> > > Do you have any concerns about the way the modification becomes purely
> > > async?
> > >
> > > > Penghui, would you mind adding a GitHub issue for the problem where all
> > > HTTP threads get blocked and the Pulsar Admin API stops responding?
> > >
> > > https://github.com/apache/pulsar/issues/4756 the attachment from the
> > > issue is a good example
> > >
> > > Regards,
> > > Penghui
> > >
> > >
> > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <[email protected]> wrote:
> > >
> > >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> > >> numHttpServerThreads=200 .
> > >> Please review
> > >>
> > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > >> > > Which is a sync method. Ultimately this could lead to all the
> > >> pulsar-web
> > >> > > thread
> > >> > > blocked. we'd better not introduce blocking calls if we use
> > >> AsyncResponse.
> > >> > >
> > >> > > > What issue did you see? Please share more context. Thanks for the
> > >> > > patience.
> > >> > >
> > >> > > It happened very earlier
> > >> > >
> > >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > >> > > And here is also a related fix
> > >> https://github.com/apache/pulsar/pull/10619
> > >> >
> > >> > Penghui, Thank you for the patience, and thanks for sharing more
> > >> context. I happened to send a reply before reading your message, so 
> > >> please
> > >> bear with me.
> > >> >
> > >> > So finally, I understand that "the problem" is that all HTTP server
> > >> threads are blocked and this makes the Pulsar Admin API unavailable.
> > >> >
> > >> > To support the blocking servlet API, Jetty uses a default thread pool
> > >> that can grow to up to 200 threads (
> > >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > >> .
> > >> > However this default of 200 maximum threads is not used in Pulsar.
> > >> >
> > >> > The problem is that Pulsar uses a low value that assumes asynchronous
> > >> API usage:
> > >> >
> > >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > >> > Pulsar should be using a high value (for example 200) as long as there
> > >> are blocking calls in Admin APIs.
> > >> >
> > >> > The mitigation to the issue of all HTTP server threads getting blocked
> > >> is setting "numHttpServerThreads=200" in broker.conf.
> > >> >
> > >> > Regarding the "make async" changes, It is an optimization to migrate
> > >> from the blocking servlet api to the asynchronous servlet api. This work
> > >> isn't urgent since we can simply mitigate the HTTP server threads getting
> > >> blocked by setting "numHttpServerThreads=200" in broker.conf. "the 
> > >> problem"
> > >> will be resolved immediately without risks of regressions that are 
> > >> involved
> > >> in making the sync -> async changes.
> > >> >
> > >> > Penghui, would you mind adding a GitHub issue for the problem where all
> > >> HTTP threads get blocked and the Pulsar Admin API stops responding?
> > >> >
> > >> > I can follow up with a PR which updates the default for
> > >> numHttpServerThreads to 200. This is a maximum value and Jetty starts 
> > >> with
> > >> 8 threads. We can agree on the default value to use in the PR.
> > >> >
> > >> > Thank you for the great collaboration on sharing the context and
> > >> describing the problem patiently.
> > >> >
> > >> > BR,
> > >> >
> > >> > -Lari
> > >> >
> > >>
> > >

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Reply via email to