Hi all,

Put an update here, I have created branch-2.10[1]

[1]https://github.com/apache/pulsar/tree/branch-2.10

On Thu, Feb 17, 2022 at 9:23 AM PengHui Li <peng...@apache.org> wrote:

> Hi Micheal
>
> > Penghui, is your current plan to create branch-2.10, create the
> release artifacts, and start a vote on them all within a few days of
> each other?
>
> Yes, I will create branch-2.10 today.
>
> For starting the vote, we need to confirm these 2 PRs[1] will not introduce
> breaking changes. Very grateful if someone can also help verify them.
>
>
> [1] https://github.com/apache/pulsar/pull/13376,
> https://github.com/apache/pulsar/pull/13341
>
> Thanks,
> Penghui
>
> On Thu, Feb 17, 2022 at 8:59 AM Matteo Merli <matteo.me...@gmail.com>
> wrote:
>
>> Yes, but I think that the code freeze is only meaningful if it’s
>> communicated in advance.
>>
>> The fact that it was included in the original PIP but never followed in
>> the
>> practice means it would be a last minute change.
>>
>> On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mmarsh...@apache.org>
>> wrote:
>>
>> > When we discussed the code freeze in the community meeting on 2/3, I
>> > was under the impression that it was a new development to our existing
>> > release process. I subsequently learned it was already defined in
>> > PIP 47. Even if we haven't been following this part of PIP 47, what
>> > is the value in waiting until 2.11 to follow our already defined
>> process?
>> > While I agree it is helpful to provide guidance on when a version will
>> > ship,
>> > I think it is more important to give the community time to test a
>> release,
>> > even if that means we're a little late on our release schedule. So far,
>> > we haven't even created a branch to begin testing.
>> >
>> > Note also that Sijie suggested using a feature freeze early on in this
>> > thread.
>> >
>> > The 2.9.0 release is relevant here. It had 4 release candidates over 4
>> > weeks and the final result was broken. That indicates to me that tagging
>> > an RC early does not guarantee an early release and that our current
>> > process isn't optimal and likely needs adjustments. I do not think we
>> > should wait to address these issues. I propose we start following
>> > PIP 47's guidance on code freeze and release stabilization periods.
>> >
>> > > I don't think that changes the picture here. There are *always* last
>> > > minute issues being discovered, and there is a call to be made on a
>> > > case by case. The feature freeze will reduce the likelihood of
>> > > introducing *more* issues by getting it from the master branch, but
>> > > won't change a comma from issues that were already there.
>> >
>> > I thought you wanted to implement a code/feature freeze to allow for
>> > more release stabilization. Can you clarify what you mean here?
>> >
>> > Thanks,
>> > Michael
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <matteo.me...@gmail.com>
>> > wrote:
>> > >
>> > > Michael, as we chatted in last weekly meeting (though not yet
>> > > formalized), since we have never really done a feature freeze on the
>> > > branch during paste releases, we should start from the next release,
>> > > to give a decent preview of what to expect to developers in terms of
>> > > dates.
>> > >
>> > > > While some may feel "behind" in getting out the 2.10 release, our
>> > > > priority must be to give the community time to verify the stability
>> of
>> > > > the release.
>> > >
>> > > I don't think that changes the picture here. There are *always* last
>> > > minute issues being discovered, and there is a call to be made on a
>> > > case by case. The feature freeze will reduce the likelihood of
>> > > introducing *more* issues by getting it from the master branch, but
>> > > won't change a comma from issues that were already there.
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Matteo Merli
>> > > <matteo.me...@gmail.com>
>> > >
>> > > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <
>> mmarsh...@apache.org>
>> > wrote:
>> > > >
>> > > > > I will build the release and start the vote before next
>> Monday(GMT+8)
>> > > >
>> > > > Penghui, is your current plan to create branch-2.10, create the
>> > > > release artifacts, and start a vote on them all within a few days of
>> > > > each other?
>> > > >
>> > > > > I'm doing my best to follow PIP 47, but when seeing a potential
>> break
>> > > > > change, I need to confirm it.
>> > > > > After all the potential break changes have been confirmed and
>> fixed,
>> > I will
>> > > > > start the vote thread.
>> > > >
>> > > > I think we should review our current release plan before we move
>> > > > forward as proposed above. PIP 47 explicitly says that a month
>> before
>> > > > the release date, the release manager will cut branches [0]. We
>> don't
>> > > > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
>> > > > feature freeze and then a code freeze. We have not yet had either.
>> > > >
>> > > > I propose we create branch-2.10 now and simultaneously announce that
>> > > > we are past the feature freeze period. Then, we can start the 2 week
>> > > > period for bug fixes that precedes the code freeze, as PIP 47
>> > > > prescribes. Then, in two weeks, we can produce the first release
>> > > > candidate (also in PIP 47).
>> > > >
>> > > > While some may feel "behind" in getting out the 2.10 release, our
>> > > > priority must be to give the community time to verify the stability
>> of
>> > > > the release.
>> > > >
>> > > > Thanks,
>> > > > Michael
>> > > >
>> > > > [0]
>> > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <peng...@apache.org>
>> wrote:
>> > > > >
>> > > > > Hi all
>> > > > >
>> > > > > Just put an update here.
>> > > > >
>> > > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
>> > > > > https://github.com/apache/pulsar/pull/13341
>> > > > > need to do the final verification, and you are also very welcome
>> to
>> > verify
>> > > > > these 2 changes in your environment, cases.
>> > > > >
>> > > > > I will build the release and start the vote before next
>> Monday(GMT+8)
>> > > > >
>> > > > > Regards
>> > > > > Penghui
>> > > > >
>> > > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <peng...@apache.org>
>> > wrote:
>> > > > >
>> > > > > > Hi lari,
>> > > > > >
>> > > > > > > So finally, I understand that "the problem" is that all HTTP
>> > server
>> > > > > > threads are blocked and this makes the Pulsar Admin API
>> > unavailable.
>> > > > > >
>> > > > > > To support the blocking servlet API, Jetty uses a default thread
>> > pool that
>> > > > > > can grow to up to 200 threads (
>> > > > > >
>> >
>> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
>> > )
>> > > > > > .
>> > > > > > However this default of 200 maximum threads is not used in
>> Pulsar.
>> > > > > >
>> > > > > >  Regarding the "make async" changes, It is an optimization to
>> > migrate from
>> > > > > > the blocking servlet api to the asynchronous servlet api. This
>> > work isn't
>> > > > > > urgent since we can simply mitigate the HTTP server threads
>> > getting blocked
>> > > > > > by setting "numHttpServerThreads=200" in broker.conf. "the
>> > problem" will be
>> > > > > > resolved immediately without risks of regressions that are
>> > involved in
>> > > > > > making the sync -> async changes.
>> > > > > >
>> > > > > > Yes, this is the problem. But I am against using 200 threads as
>> > the max
>> > > > > > web server thread by default,
>> > > > > > it can't work for cases that the broker without that much
>> memory,
>> > it will
>> > > > > > lead to more serious problems
>> > > > > > that the service quality of messaging API gets worse due to the
>> JVM
>> > > > > > GC, even memory overflow.
>> > > > > >
>> > > > > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10
>> > release,
>> > > > > > and all the PRs are not cherry-picked to branch-2.x
>> > > > > > This is an optimization for pulsar, the current implementation
>> > does not
>> > > > > > use jetty async API well, we should fix it,
>> > > > > > we should reduce the code with bad smells, and using async API
>> is
>> > also
>> > > > > > a more efficient way without opening such jetty threads.
>> > > > > > Do you have any concerns about the way the modification becomes
>> > purely
>> > > > > > async?
>> > > > > >
>> > > > > > > Penghui, would you mind adding a GitHub issue for the problem
>> > where all
>> > > > > > HTTP threads get blocked and the Pulsar Admin API stops
>> responding?
>> > > > > >
>> > > > > > https://github.com/apache/pulsar/issues/4756 the attachment
>> from
>> > the
>> > > > > > issue is a good example
>> > > > > >
>> > > > > > Regards,
>> > > > > > Penghui
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lhot...@apache.org
>> >
>> > wrote:
>> > > > > >
>> > > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to
>> set
>> > > > > >> numHttpServerThreads=200 .
>> > > > > >> Please review
>> > > > > >>
>> > > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
>> > > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
>> > > > > >> > > Which is a sync method. Ultimately this could lead to all
>> the
>> > > > > >> pulsar-web
>> > > > > >> > > thread
>> > > > > >> > > blocked. we'd better not introduce blocking calls if we use
>> > > > > >> AsyncResponse.
>> > > > > >> > >
>> > > > > >> > > > What issue did you see? Please share more context. Thanks
>> > for the
>> > > > > >> > > patience.
>> > > > > >> > >
>> > > > > >> > > It happened very earlier
>> > > > > >> > >
>> > > > > >> > > Here is the issue
>> > https://github.com/apache/pulsar/issues/4756
>> > > > > >> > > And here is also a related fix
>> > > > > >> https://github.com/apache/pulsar/pull/10619
>> > > > > >> >
>> > > > > >> > Penghui, Thank you for the patience, and thanks for sharing
>> more
>> > > > > >> context. I happened to send a reply before reading your
>> message,
>> > so please
>> > > > > >> bear with me.
>> > > > > >> >
>> > > > > >> > So finally, I understand that "the problem" is that all HTTP
>> > server
>> > > > > >> threads are blocked and this makes the Pulsar Admin API
>> > unavailable.
>> > > > > >> >
>> > > > > >> > To support the blocking servlet API, Jetty uses a default
>> > thread pool
>> > > > > >> that can grow to up to 200 threads (
>> > > > > >>
>> >
>> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
>> > )
>> > > > > >> .
>> > > > > >> > However this default of 200 maximum threads is not used in
>> > Pulsar.
>> > > > > >> >
>> > > > > >> > The problem is that Pulsar uses a low value that assumes
>> > asynchronous
>> > > > > >> API usage:
>> > > > > >> >
>> > > > > >>
>> >
>> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
>> > > > > >> > Pulsar should be using a high value (for example 200) as long
>> > as there
>> > > > > >> are blocking calls in Admin APIs.
>> > > > > >> >
>> > > > > >> > The mitigation to the issue of all HTTP server threads
>> getting
>> > blocked
>> > > > > >> is setting "numHttpServerThreads=200" in broker.conf.
>> > > > > >> >
>> > > > > >> > Regarding the "make async" changes, It is an optimization to
>> > migrate
>> > > > > >> from the blocking servlet api to the asynchronous servlet api.
>> > This work
>> > > > > >> isn't urgent since we can simply mitigate the HTTP server
>> threads
>> > getting
>> > > > > >> blocked by setting "numHttpServerThreads=200" in broker.conf.
>> > "the problem"
>> > > > > >> will be resolved immediately without risks of regressions that
>> > are involved
>> > > > > >> in making the sync -> async changes.
>> > > > > >> >
>> > > > > >> > Penghui, would you mind adding a GitHub issue for the problem
>> > where all
>> > > > > >> HTTP threads get blocked and the Pulsar Admin API stops
>> > responding?
>> > > > > >> >
>> > > > > >> > I can follow up with a PR which updates the default for
>> > > > > >> numHttpServerThreads to 200. This is a maximum value and Jetty
>> > starts with
>> > > > > >> 8 threads. We can agree on the default value to use in the PR.
>> > > > > >> >
>> > > > > >> > Thank you for the great collaboration on sharing the context
>> and
>> > > > > >> describing the problem patiently.
>> > > > > >> >
>> > > > > >> > BR,
>> > > > > >> >
>> > > > > >> > -Lari
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> >
>> --
>> --
>> Matteo Merli
>> <matteo.me...@gmail.com>
>>
>

Reply via email to