Hi Michael,

> Penghui, is your current plan to create branch-2.10, create the release artifacts, and start a vote on them all within a few days of each other?
Yes, I will create branch-2.10 today.

Before starting the vote, we need to confirm that these 2 PRs [1] will not introduce breaking changes. I would be very grateful if someone could also help verify them.

[1] https://github.com/apache/pulsar/pull/13376, https://github.com/apache/pulsar/pull/13341

Thanks,
Penghui

On Thu, Feb 17, 2022 at 8:59 AM Matteo Merli <matteo.me...@gmail.com> wrote:

> Yes, but I think that the code freeze is only meaningful if it’s communicated in advance.
>
> The fact that it was included in the original PIP but never followed in the practice means it would be a last minute change.
>
> On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mmarsh...@apache.org> wrote:
>
> > When we discussed the code freeze in the community meeting on 2/3, I was under the impression that it was a new development to our existing release process. I subsequently learned it was already defined in PIP 47. Even if we haven't been following this part of PIP 47, what is the value in waiting until 2.11 to follow our already defined process? While I agree it is helpful to provide guidance on when a version will ship, I think it is more important to give the community time to test a release, even if that means we're a little late on our release schedule. So far, we haven't even created a branch to begin testing.
> >
> > Note also that Sijie suggested using a feature freeze early on in this thread.
> >
> > The 2.9.0 release is relevant here. It had 4 release candidates over 4 weeks and the final result was broken. That indicates to me that tagging an RC early does not guarantee an early release and that our current process isn't optimal and likely needs adjustments. I do not think we should wait to address these issues. I propose we start following PIP 47's guidance on code freeze and release stabilization periods.
> >
> > > I don't think that changes the picture here.
> > > There are *always* last minute issues being discovered, and there is a call to be made on a case by case. The feature freeze will reduce the likelihood of introducing *more* issues by getting it from the master branch, but won't change a comma from issues that were already there.
> >
> > I thought you wanted to implement a code/feature freeze to allow for more release stabilization. Can you clarify what you mean here?
> >
> > Thanks,
> > Michael
> >
> > On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <matteo.me...@gmail.com> wrote:
> > >
> > > Michael, as we chatted in last weekly meeting (though not yet formalized), since we have never really done a feature freeze on the branch during past releases, we should start from the next release, to give a decent preview of what to expect to developers in terms of dates.
> > >
> > > > While some may feel "behind" in getting out the 2.10 release, our priority must be to give the community time to verify the stability of the release.
> > >
> > > I don't think that changes the picture here. There are *always* last minute issues being discovered, and there is a call to be made on a case by case. The feature freeze will reduce the likelihood of introducing *more* issues by getting it from the master branch, but won't change a comma from issues that were already there.
> > >
> > > --
> > > Matteo Merli
> > > <matteo.me...@gmail.com>
> > >
> > > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <mmarsh...@apache.org> wrote:
> > > >
> > > > > I will build the release and start the vote before next Monday(GMT+8)
> > > >
> > > > Penghui, is your current plan to create branch-2.10, create the release artifacts, and start a vote on them all within a few days of each other?
> > > >
> > > > > I'm doing my best to follow PIP 47, but when seeing a potential break change, I need to confirm it.
> > > > > After all the potential break changes have been confirmed and fixed, I will start the vote thread.
> > > >
> > > > I think we should review our current release plan before we move forward as proposed above. PIP 47 explicitly says that a month before the release date, the release manager will cut branches [0]. We don't yet have a `branch-2.10`. PIP 47 also defines a period of time for a feature freeze and then a code freeze. We have not yet had either.
> > > >
> > > > I propose we create branch-2.10 now and simultaneously announce that we are past the feature freeze period. Then, we can start the 2 week period for bug fixes that precedes the code freeze, as PIP 47 prescribes. Then, in two weeks, we can produce the first release candidate (also in PIP 47).
> > > >
> > > > While some may feel "behind" in getting out the 2.10 release, our priority must be to give the community time to verify the stability of the release.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > [0] https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
> > > >
> > > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <peng...@apache.org> wrote:
> > > > >
> > > > > Hi all
> > > > >
> > > > > Just put an update here.
> > > > >
> > > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and https://github.com/apache/pulsar/pull/13341 need to do the final verification, and you are also very welcome to verify these 2 changes in your environment, cases.
> > > > >
> > > > > I will build the release and start the vote before next Monday(GMT+8)
> > > > >
> > > > > Regards
> > > > > Penghui
> > > > >
> > > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <peng...@apache.org> wrote:
> > > > >
> > > > > > Hi lari,
> > > > > >
> > > > > > > So finally, I understand that "the problem" is that all HTTP server threads are blocked and this makes the Pulsar Admin API unavailable.
> > > > > >
> > > > > > To support the blocking servlet API, Jetty uses a default thread pool that can grow to up to 200 threads ( https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57 ) .
> > > > > > However this default of 200 maximum threads is not used in Pulsar.
> > > > > >
> > > > > > Regarding the "make async" changes, It is an optimization to migrate from the blocking servlet api to the asynchronous servlet api. This work isn't urgent since we can simply mitigate the HTTP server threads getting blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be resolved immediately without risks of regressions that are involved in making the sync -> async changes.
> > > > > >
> > > > > > Yes, this is the problem. But I am against using 200 threads as the max web server thread by default,
> > > > > > it can't work for cases that the broker without that much memory, it will lead to more serious problems
> > > > > > that the service quality of messaging API gets worse due to the JVM GC, even memory overflow.
> > > > > >
> > > > > > Yes, it isn't urgent.
> > > > > > So I said it's not a blocker for the 2.10 release,
> > > > > > and all the PRs are not cherry-picked to branch-2.x
> > > > > > This is an optimization for pulsar, the current implementation does not use jetty async API well, we should fix it,
> > > > > > we should reduce the code with bad smells, and using async API is also a more efficient way without opening such jetty threads.
> > > > > > Do you have any concerns about the way the modification becomes purely async?
> > > > > >
> > > > > > > Penghui, would you mind adding a GitHub issue for the problem where all HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > > > >
> > > > > > https://github.com/apache/pulsar/issues/4756 the attachment from the issue is a good example
> > > > > >
> > > > > > Regards,
> > > > > > Penghui
> > > > > >
> > > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lhot...@apache.org> wrote:
> > > > > >
> > > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to set numHttpServerThreads=200 .
> > > > > >> Please review
> > > > > >>
> > > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > > > > >> > > Which is a sync method. Ultimately this could lead to all the pulsar-web thread blocked. we'd better not introduce blocking calls if we use AsyncResponse.
> > > > > >> > >
> > > > > >> > > > What issue did you see? Please share more context. Thanks for the patience.
> > > > > >> > >
> > > > > >> > > It happened very earlier
> > > > > >> > >
> > > > > >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > > > > >> > > And here is also a related fix https://github.com/apache/pulsar/pull/10619
> > > > > >> >
> > > > > >> > Penghui, Thank you for the patience, and thanks for sharing more context. I happened to send a reply before reading your message, so please bear with me.
> > > > > >> >
> > > > > >> > So finally, I understand that "the problem" is that all HTTP server threads are blocked and this makes the Pulsar Admin API unavailable.
> > > > > >> >
> > > > > >> > To support the blocking servlet API, Jetty uses a default thread pool that can grow to up to 200 threads ( https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57 ) .
> > > > > >> > However this default of 200 maximum threads is not used in Pulsar.
> > > > > >> >
> > > > > >> > The problem is that Pulsar uses a low value that assumes asynchronous API usage:
> > > > > >> > https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > > > > >> > Pulsar should be using a high value (for example 200) as long as there are blocking calls in Admin APIs.
> > > > > >> >
> > > > > >> > The mitigation to the issue of all HTTP server threads getting blocked is setting "numHttpServerThreads=200" in broker.conf.
> > > > > >> >
> > > > > >> > Regarding the "make async" changes, It is an optimization to migrate from the blocking servlet api to the asynchronous servlet api.
> > > > > >> > This work isn't urgent since we can simply mitigate the HTTP server threads getting blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be resolved immediately without risks of regressions that are involved in making the sync -> async changes.
> > > > > >> >
> > > > > >> > Penghui, would you mind adding a GitHub issue for the problem where all HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > > > >> >
> > > > > >> > I can follow up with a PR which updates the default for numHttpServerThreads to 200. This is a maximum value and Jetty starts with 8 threads. We can agree on the default value to use in the PR.
> > > > > >> >
> > > > > >> > Thank you for the great collaboration on sharing the context and describing the problem patiently.
> > > > > >> >
> > > > > >> > BR,
> > > > > >> >
> > > > > >> > -Lari
> > > > > >> >
> --
> --
> Matteo Merli
> <matteo.me...@gmail.com>