Re: PIP 112: Generate Release Notes Automatically
Hi all,

This is a follow-up to my last email. Previously, we used markdown files to create issue templates [1]. I've changed the doc_request issue template to a custom issue form by adding a YAML form definition file, which is more intuitive and easier to use.

Feel free to comment on this PR [2], thanks.

[1] https://github.com/apache/pulsar/tree/master/.github/ISSUE_TEMPLATE
[2] https://github.com/apache/pulsar/pull/13359

On Tue, Dec 14, 2021 at 9:09 PM Yu Liu wrote:

> Spot on.
> This also reminds me that we can create custom issue forms by adding YAML
> form definition files [1], which is more user-friendly and easy to maintain.
>
> [1] https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-issue-forms#about-yaml-syntax-for-issue-forms
>
> On Tue, Dec 14, 2021 at 2:25 AM Michael Marshall wrote:
>
>> +1 Yu, thank you for putting together this thorough document. This is
>> a great initiative.
>>
>> I think it might help to review and possibly update the PR template as
>> part of this PIP. For example, the current template does not prompt
>> authors whether the PR should be mentioned in release notes. Such a
>> prompt could help committers determine the right labels for a PR.
>>
>> Thanks,
>> Michael
>>
>> On Mon, Dec 13, 2021 at 4:56 AM Li Li wrote:
>>>
>>> +1
>>>
>>> Good idea, I think I can be part of this PIP after I finish upgrading
>>> the Pulsar website.
>>>
>>> Thanks,
>>> LiLi
>>>
>>>> On Dec 13, 2021, at 4:18 PM, Yu wrote:
>>>>
>>>> Hi Pulsarers,
>>>>
>>>> As we know [1], there are some issues in the current Pulsar release notes
>>>> (RN), for example:
>>>>
>>>> - For Pulsar users
>>>>   They cannot capture the highlights quickly since the RN is a raw dump of PRs.
>>>>
>>>> - For Pulsar release managers (RM)
>>>>   They feel overwhelmed by the **manual** workload of generating RN since it
>>>>   is created based on git commit messages, while many people do not provide
>>>>   clear and meaningful info.
>>>>   It’s time-consuming to clear up all info especially for a major release
>>>>   with lots of PRs.
>>>>
>>>> If RN is regarded as an afterthought and finished as a last-minute task, it
>>>> is likely not written well.
>>>> Instead of rushing, treating RN as a part of development not only reduces
>>>> RM's workload and makes communication more coordinated,
>>>> but also allows more time for us to choose the most valuable highlights
>>>> shown to users.
>>>> Consequently, the process of the current workflow should be improved.
>>>>
>>>> Therefore, I propose the PIP 112: Generate Release Notes Automatically [2]
>>>> and add some initial thoughts and research there.
>>>> It is only a draft but I would like to invite you to join us to bring
>>>> another major change to Pulsar. I believe this would bring many benefits to
>>>> all of us, thanks!
>>>>
>>>> [1] https://lists.apache.org/thread/dl3jb9p3zvlc6ntlkpmxf1m8dw5dcd8z
>>>> [2] https://github.com/apache/pulsar/wiki/PIP-112%3A-Generate-Release-Notes-Automatically
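To make the idea of label-driven release notes concrete, here is a minimal sketch of how merged PRs might be grouped into sections by label. It is only an illustration under stated assumptions: the `ReleaseNotesSketch` class, the `PullRequest` type, the label names, and the section titles are all hypothetical and are not taken from the PIP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turn labeled, merged PRs into grouped release notes. */
final class ReleaseNotesSketch {

    /** Minimal stand-in for a merged PR; not a GitHub API type. */
    static final class PullRequest {
        final int number;
        final String title;
        final List<String> labels;

        PullRequest(int number, String title, List<String> labels) {
            this.number = number;
            this.title = title;
            this.labels = labels;
        }
    }

    // Assumed label-to-section mapping, checked in this order.
    private static final Map<String, String> SECTION_BY_LABEL = new LinkedHashMap<>();
    static {
        SECTION_BY_LABEL.put("release-note/highlight", "Highlights");
        SECTION_BY_LABEL.put("type/feature", "Features");
        SECTION_BY_LABEL.put("type/bug", "Bug fixes");
    }

    /** Renders one markdown section per non-empty group; PRs without a known label are skipped. */
    static String render(List<PullRequest> prs) {
        Map<String, List<String>> entries = new LinkedHashMap<>();
        for (String section : SECTION_BY_LABEL.values()) {
            entries.put(section, new ArrayList<String>());
        }
        for (PullRequest pr : prs) {
            for (Map.Entry<String, String> mapping : SECTION_BY_LABEL.entrySet()) {
                if (pr.labels.contains(mapping.getKey())) {
                    entries.get(mapping.getValue()).add("- " + pr.title + " (#" + pr.number + ")");
                    break; // the first matching label decides the section
                }
            }
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> section : entries.entrySet()) {
            if (section.getValue().isEmpty()) {
                continue;
            }
            out.append("### ").append(section.getKey()).append('\n');
            for (String line : section.getValue()) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }
}
```

With a scheme like this, the quality of the generated notes depends on PR titles and labels being set at review time, which is why a prompt in the PR template could help.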
[VOTE] Apache Pulsar 2.9.1 candidate 2
This is the second release candidate for Apache Pulsar, version 2.9.1.

The first release candidate was aborted without starting a VOTE because we
had to pick up high priority dependency upgrades.

It fixes the following issues:
https://github.com/apache/pulsar/pulls?q=is%3Apr++label%3Arelease%2F2.9.1+

*** Please download, test and vote on this release. This vote will stay open
for at least 72 hours ***

Note that we are voting upon the source (tag), binaries are provided for
convenience.

Source and binary files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.9.1-candidate-2/

SHA-512 checksums:

5ca7d2c6a8ac51413214796481095bbde50b5bda95d8b8f2467989931b29c75e679aabcfebd82e9e3e90dd1644c580214e0a05eca8652a500f042c84cb21becd
apache-pulsar-2.9.1-bin.tar.gz

34a1e22fb0ff2e69e7e880a9432526990610113cf89d93c953dff82cc443510dcf724eaa0e1fade82464f9bf5443655bd23bcf2064e312c4a9da70bb4c9937ba
apache-pulsar-2.9.1-src.tar.gz

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachepulsar-1110

The tag to be voted upon:
v2.9.1-candidate-2 (f52ac045f41acbb6c31da21a3463df3cfbe8f1b4)
https://github.com/apache/pulsar/releases/tag/v2.9.1-candidate-2

Link to the release notes:
https://github.com/apache/pulsar/pull/13357

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://dist.apache.org/repos/dist/dev/pulsar/KEYS

Please download the source package, and follow the README to build
and run the Pulsar standalone service.

Enrico Olivelli
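For voters who want to script the checksum step, the following is a minimal sketch of comparing a downloaded artifact against a published SHA-512 value. It uses only the JDK's `MessageDigest`; the class name `Sha512Check` and its command-line shape are assumptions made for illustration, and the sketch does not replace verifying the PGP signatures against the KEYS file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: compute a SHA-512 digest and compare it with a published value. */
final class Sha512Check {

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is expected to ship SHA-512.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Digest of an in-memory byte array, as lowercase hex. */
    static String sha512Hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    /** Digest of a file, streamed in chunks so large tarballs are not loaded into memory. */
    static String sha512Hex(Path file) throws IOException {
        MessageDigest md = newDigest();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    /** Usage: java Sha512Check.java <file> <expected-sha512-hex> */
    public static void main(String[] args) throws IOException {
        String actual = sha512Hex(Paths.get(args[0]));
        System.out.println(actual.equalsIgnoreCase(args[1].trim()) ? "OK" : "MISMATCH");
    }
}
```

The expected value would be pasted from the vote email, for example the line above `apache-pulsar-2.9.1-src.tar.gz`.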
Re: [VOTE] Apache Pulsar 2.9.1 candidate 2
+1 (non binding)

Checks:
- Checksum and signatures
- Apache Rat check passes
- OWASP check passes (I created this PR to fix a false positive:
  https://github.com/apache/pulsar/pull/13364)
- Compile from source with JDK 11
- Build docker image from source
- Run Pulsar standalone and produce-consume from CLI
- Verified the presence of the Log4j 2.16.0 jar in the docker image and the tarball

--
Nicolò Boschi
Re: [DISCUSS] How to handle stale PRs
I just saw that another project, https://github.com/openmessaging/benchmark, uses probot-stale: https://github.com/probot/stale

This looks like it has all the features needed to close both stale issues and PRs. It allows labels to be used to prevent closure of certain issues and PRs.

Here is their configuration:
https://github.com/openmessaging/benchmark/blob/master/.github/stale.yml

This bot is allowed under github.com/apache, where 11 repositories currently use it. When we are ready we will simply create an INFRA JIRA.

> On Dec 15, 2021, at 4:15 PM, Dave Fisher wrote:
>
>> On Dec 15, 2021, at 4:06 PM, Matteo Merli wrote:
>>
>>> Is #3267 Support set publish time on broker side one of those very valuable
>>> ideas that was later rejected, likely for performance reasons?
>>
>> No, this was one that was superseded by other changes.
>
> Then I’ll close it.
>
>>> One problem with the current state is that PRs and even higher level ideas
>>> have a shelf life. Declaring PR bankruptcy does in fact solve this problem.
>>
>> I don't believe that is true in all cases and I absolutely don't
>> believe that it is not possible to keep up with the PRs, when the
>> reviewing workload is well balanced.
>>
>> I'm seeing a lot of opinions here, but at the end of the day the
>> people doing the hard work of reviewing are always the same few ones.
>
> (1) These are opinions about how to do the work. If you want someone to JFDI
> it then I’m happy to start closing and labeling as I suggested.

I started closing PRs with a new label - status/stale
https://github.com/apache/pulsar/issues?q=label%3Astatus%2Fstale+is%3Aclosed

> (2) There is a kind of deference being shown to those individuals based on
> who the contributor selects for review. I wish there was a way for a
> contributor to ask the dev list for a review.

I plan to research how we might modify how reviews are requested. I think that can be in another thread.

>>> Once we have guidance, I am happy to add it to the Committer Guide on
>>> the wiki [0].
>>
>> Michael, I agree 100% with that. We should write clear guidelines to
>> describe when it makes sense to close, leave for the record, call for
>> "help" to continue working on and so on. That will help committers and
>> contributors.
>>
>>> Matteo, your comment raises an additional question for me. What are
>>> Apache's rules for completing someone else's contribution? If someone
>>> opens a PR to fix a bug, but it is incomplete and they become
>>> unresponsive, how can we move their contribution forward? These are
>>> the PRs we don't want to close.
>>
>> I don't think there is any problem in completing someone else's PR,
>> provided that:
>> * The original author is non-responsive or has no time to work on it
>>   at the moment (otherwise it would be kind of rude).
>> * We give the right credit to the original author (github has good
>>   support for multiple authorship of a commit)
>>
>> Continuing with a PR is not very different from merging the WIP and
>> fixing it later in a second commit, from a legal perspective.
>>
>> IANAL, though *AFAIU*, when a contributor is opening a PR is already
>> assigning the IP to the ASF. A committer will merge that code (after
>> due diligence that it doesn't contain inadmissible code), but the code
>> is already "donated to the ASF" at the moment of the PR.
>
> +1.
>
> Regards,
> Dave
>
>> --
>> Matteo Merli
>>
>> On Wed, Dec 15, 2021 at 3:14 PM Chris Herzog wrote:
>>>
>>> It isn't even an issue related to OSS - every long lived project suffers
>>> from this same issue. Whether it's a long lingering defect report or a fix
>>> that never got integrated in a timely manner, time wounds all heels.
>>>
>>> Careful considered review is perfection which can't be hit; if it could be
>>> done, the situation would never have occurred in the first place. Having a
>>> time-to-live is pragmatic, not perfect, but pragmatic.
>>>
>>> As Jonathan mentioned, if ideas or changes linger too long, they often are
>>> superseded or replaced with more applicable alternatives or might not have
>>> been that important in the first place. It's a shame because each
>>> languishing PR represents some amount of work from someone (sometimes a
>>> non-trivial amount) but there really isn't a more practical alternative IMO.
>>>
>>> On Wed, Dec 15, 2021 at 5:05 PM Jonathan Ellis wrote:
>>>
>>>> One problem with the current state is that PRs and even higher level ideas
>>>> have a shelf life. Declaring PR bankruptcy does in fact solve this problem.
>>>>
>>>> The other problem is that from a new contributor's perspective it's
>>>> impossible to tell which issues are relevant and which are clutter that we
>>>> haven't gotten around to closing out. For this, declaring PR bankruptcy
>>>> isn't as good as somehow having the capacity to review and respond to
>>>> everything, but it's still better than the
Re: [VOTE] Apache Pulsar 2.9.1 candidate 2
I have pushed the docker images to my personal Docker Hub account:

eolivelli/pulsar:2.9.1rc2
eolivelli/pulsar-all:2.9.1rc2

Enrico
Re: [VOTE] Apache Pulsar 2.9.1 candidate 2
+1

Checked:
* Signatures
* Bin distribution:
  - NOTICE, README, LICENSE
  - Start standalone service and producer/consumer test
* Src distribution:
  - NOTICE, README, LICENSE
  - Compile and unit tests
  - Start standalone service
* Checked staging maven repository artifacts
* Checked docker images

Matteo
--
Matteo Merli
Re: [DISCUSS] Release Pulsar 2.7.4
Hi,

We have fixed some issues, such as the ZookeeperCache NPE and the namespace listing exception, and skipped some flaky tests (verified locally). The CI now passes.
The skipped flaky tests are tracked here: https://github.com/apache/pulsar/issues/13299

We have now decided to start the vote for releasing 2.7.4.

Regards
Jiwei Guo (Tboy)

On Tue, Dec 14, 2021 at 11:58 AM PengHui Li wrote:

> Thanks for the update, I will move it to 2.7.5
>
> Thanks,
> Penghui
>
> On Tue, Dec 14, 2021 at 9:47 AM Matteo Merli wrote:
>
>> Let's take https://github.com/apache/pulsar/pull/12484 out of the
>> picture since it's failing the tests.
>>
>> --
>> Matteo Merli
>>
>> On Sun, Dec 12, 2021 at 11:06 PM PengHui Li wrote:
>>>
>>> Yes,
>>>
>>> https://github.com/apache/pulsar/pull/13215 has been cherry-picked, so we can
>>> close it.
>>> https://github.com/apache/pulsar/pull/12484 is blocked by the test.
>>>
>>> Penghui
>>>
>>> On Mon, Dec 13, 2021 at 2:35 PM Dave Fisher wrote:
>>>>
>>>> I see 2 PRs still open at
>>>> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+label%3Arelease%2F2.7.4
>>>>
>>>> Sent from my iPhone
>>>>
>>>>> On Dec 12, 2021, at 8:22 PM, guo jiwei wrote:
>>>>>
>>>>> I have pushed out some fixes in
>>>>> https://github.com/apache/pulsar/pull/13243
>>>>> After the tests get passed, I will send out the RC-1 VOTE for 2.7.4
>>>>>
>>>>> Regards
>>>>> Jiwei Guo (Tboy)
>>>>>
>>>>>> On Sun, Dec 12, 2021 at 3:11 PM PengHui Li wrote:
>>>>>>
>>>>>> Just put an update here. We have done the PR cherry-picking
>>>>>>
>>>>>> https://github.com/apache/pulsar/commits/branch-2.7
>>>>>>
>>>>>> And most of the integration tests are fixed due to the docker image issue
>>>>>> or the testcontainer issue, now some integration tests get passed, but some
>>>>>> are not.
> > > > >> And there are some failed tests, maybe a flaky test, we need to > > ensure > > > > it's > > > > >> not a regression. > > > > >> > > > > >> We are continuing on the test part. > > > > >> > > > > >> Penghui > > > > >> > > > > >> > > > > >> > > > > >>> On Sat, Dec 11, 2021 at 5:36 PM PengHui Li > > wrote: > > > > >>> > > > > >>> Hi Michael, > > > > >>> > > > > >>> +1, > > > > >>> > > > > >>> Thanks for the great work. > > > > >>> We will continue on the PR cherry-picking and the release process > > to > > > > make > > > > >>> sure the urgent release can be done ASAP. > > > > >>> > > > > >>> Penghui > > > > >>> > > > > >>> On Sat, Dec 11, 2021 at 3:42 PM Michael Marshall < > > mmarsh...@apache.org > > > > > > > > > >>> wrote: > > > > >>> > > > > Given the log4j CVE, we should work to release 2.7.4. > > > > > > > > I started preparing the release today by cherry-picking merged > PRs > > > > that have the `release/2.7.4` label but have not yet been > > > > cherry-picked to `branch-2.7` [0]. There are still 37 PRs that > > have > > > > not been cherry picked. I think it will take too long to cherry > > pick > > > > all of these commits, as many have conflicts, and we should > > prioritize > > > > releasing 2.7.4. The main commits that we should get > cherry-picked > > > > before creating the git tag are any labeled with > > `component/security`. > > > > There are only a few remaining commits to cherry pick. Please > let > > me > > > > know if you think any other commits ought to be cherry-picked. > > > > > > > > The earliest I'll be able to build the release is Monday. If we > > need > > > > to start sooner, perhaps someone else will be available to > manage > > this > > > > urgent release. 
> > > > > > > > Thanks, > > > > Michael > > > > > > > > [0] - > > > > > > > > >> > > > > > > > https://github.com/apache/pulsar/pulls?page=2&q=label%3Arelease%2F2.7.4+sort%3Acreated-asc+is%3Apr+-label%3Acherry-picked%2Fbranch-2.7 > > > > [1] - > > > > > > > > >> > > > > > > > https://github.com/apache/pulsar/pulls?q=label%3Arelease%2F2.7.4+sort%3Acreated-asc+is%3Apr+-label%3Acherry-picked%2Fbranch-2.7+label%3Acomponent%2Fsecurity > > > > > > > > > > > > On Thu, Dec 9, 2021 at 4:03 PM Neng Lu > wrote: > > > > > > > > > > +1 > > > > > > > > > > On 2021/12/09 15:29:55 Michael Marshall wrote: > > > > >> Hello Pulsar Community, > > > > >> > > > > >> I'd like to propose that we release 2.7.4. We have merged > > several > > > > >> important fixes since we released 2.7.3 in August. > > > > >> > > > > >> I am happy to volunteer to be the release manager. > > > > >> > > > > >> Here [0] you can find the list of 36 commits cherry-picked to > > > > >> branch-2.7 since 2.7.3 release. It looks like there are more > PRs > > > > >> labeled with `release/2.7.4` than commits cherry-picked, so I > > will > > > > >> need to work on cherry-picking those befo
Re: [DISCUSSION] PIP-117: Change Pulsar standalone defaults
+1

On Tue, Dec 14, 2021 at 9:18 AM Matteo Merli wrote:

> https://github.com/apache/pulsar/issues/13302
>
> Copying here for quoting convenience
>
> ## Motivation
>
> Pulsar standalone is the "Pulsar in a box" version of a Pulsar cluster, where
> all the components are started within the context of a single JVM process.
>
> Users are using the standalone as a way to get quickly started with Pulsar or
> in all the cases where it makes sense to have a single node deployment.
>
> Right now, the standalone is starting by default with many components, several of
> which are quite complex, since they are designed to be deployed in a distributed
> fashion.
>
> ## Goal
>
> Simplify the components of Pulsar standalone to achieve:
>
> 1. Reduce complexity
> 2. Reduce startup time
> 3. Reduce memory and CPU footprint of running standalone
>
> ## Proposed changes
>
> The proposal here is to change some of the default implementations that are
> used for the Pulsar standalone.
>
> 1. **Metadata Store implementation** -->
>    Change from ZooKeeper to RocksDB
>
> 2. **Pulsar functions package backend** -->
>    Change from using DistributedLog to using local filesystem, storing the
>    jars directly in the data folder instead of uploading them into BK.
>
> 3. **Pulsar functions state store implementation** -->
>    Change the state store to be backed by a MetadataStore-based backend,
>    with the RocksDB implementation.
>
> 4. **Table Service** -->
>    Do not start BK table service by default
>
> ## Compatibility considerations
>
> In order to avoid compatibility issues where users have existing Pulsar
> standalone services that they want to upgrade without conflicts, we will
> follow the principle of keeping the old defaults where there is existing
> data on the disk.
>
> We will add a file, serving the purpose as a flag, in the `data/standalone`
> directory, for example `new-2.10-defaults`.
>
> If the file is present, or if the data directory is completely missing, we
> will adopt the new set of default configuration settings.
>
> If the file is not there, we will continue to use existing defaults and we will
> not break the upgrade operation.
>
> --
> Matteo Merli
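The compatibility rule in the proposal can be read as a small decision function. The sketch below is one possible reading, not the actual standalone code: the class and method names are hypothetical, and the flag file name is the example given in the proposal.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of the "keep old defaults when old data exists" rule. */
final class StandaloneDefaultsSketch {

    // Example flag file name taken from the proposal text; the final name may differ.
    static final String FLAG_FILE = "new-2.10-defaults";

    /**
     * Returns true when the new defaults should apply:
     * either the data directory is completely missing (fresh install),
     * or the flag file is present inside it.
     */
    static boolean useNewDefaults(Path dataDir) {
        if (!Files.exists(dataDir)) {
            return true;
        }
        return Files.exists(dataDir.resolve(FLAG_FILE));
    }
}
```

A data directory that exists without the flag file is treated as pre-existing data, so an upgraded standalone would keep the old defaults and continue to read what is already on disk.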
Re: [DISCUSSION] PIP-118: Do not restart brokers when ZooKeeper session expires
+1

On Tue, Dec 14, 2021 at 10:03 AM Matteo Merli wrote:

> https://github.com/apache/pulsar/issues/13304
>
> Pasted below for quoting convenience.
>
> ---
>
> ## Motivation
>
> The work done for PIP-45, already included in the 2.8 and 2.9 releases,
> enabled the concept of re-acquirable resource locks and leader election.
>
> Another important change was to avoid doing any deferrable metadata operation
> when we know that we are not currently connected to the metadata service.
>
> Finally, that enabled us to stabilize, in 2.9, the configuration setting that
> allows brokers to continue operating in a safe mode when the session with
> ZooKeeper expires.
>
> The way it works is that, when we lose a ZooKeeper session, the data plane will
> continue to work undisturbed, relying on the BookKeeper fencing to avoid any
> inconsistencies.
>
> New topics are not able to get started, but existing topics will see no
> impact.
>
> The original intention for shutting down the brokers was to ensure that we
> would automatically go back to a consistent state, with respect to which
> resources are "owned" in ZooKeeper by a given broker.
>
> With the re-acquirable resource locks, that problem was solved and thoroughly
> tested to be robust.
>
> ## Proposed changes
>
> In 2.10 release, for the setting:
>
> ```properties
> # There are two policies to apply when a broker metadata session
> # expires: "shutdown" or "reconnect".
> # With "shutdown", the broker will be restarted.
> # With "reconnect", the broker will keep serving the topics, while
> # attempting to recreate a new session.
> zookeeperSessionExpiredPolicy=shutdown
> ```
>
> Change its default value to `reconnect`.
>
> --
> Matteo Merli
Re: [DISCUSSION] PIP-119: Enable consistent hashing by default on KeyShared dispatcher
+1

On Tue, Dec 14, 2021 at 10:15 AM Matteo Merli wrote:

> Pasted below for quoting convenience.
>
> ## Motivation
>
> The consistent hashing implementation to uniformly assign keys to consumers
> in the context of a KeyShared subscription was introduced in
> https://github.com/apache/pulsar/pull/6791, which was released in Pulsar
> 2.6.0.
>
> While consistent hashing can use slightly more memory in certain cases, it is
> more suitable as a general default implementation, as it leads to a fairer
> distribution of keys across consumers and avoids corner cases that depend
> on the sequence of addition/removal of consumers.
>
> ## Proposed changes
>
> In 2.10 release, for the setting:
>
> ```properties
> # On KeyShared subscriptions, with default AUTO_SPLIT mode, use
> # splitting ranges or consistent hashing to reassign keys to new consumers
> subscriptionKeySharedUseConsistentHashing=false
> ```
>
> Change its default value to `true`.
>
> The `AUTO_SPLIT` mode will not be removed nor deprecated. Users will still be
> able to use the old implementation.
>
> --
> Matteo Merli
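For readers unfamiliar with the technique, here is a minimal, generic consistent-hashing ring. It is a sketch of the general idea only and is not the broker's implementation: the class name, the number of points per consumer, and the use of CRC32 as the hash function are all assumptions chosen to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Generic consistent-hashing ring; a sketch, not Pulsar's KeyShared selector. */
final class ConsistentHashRingSketch {

    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int pointsPerConsumer;

    ConsistentHashRingSketch(int pointsPerConsumer) {
        this.pointsPerConsumer = pointsPerConsumer;
    }

    /** Places several points for the consumer on the ring to smooth the distribution. */
    void addConsumer(String consumerName) {
        for (int i = 0; i < pointsPerConsumer; i++) {
            ring.put(hash(consumerName + "#" + i), consumerName);
        }
    }

    /** Removes all points of the consumer; only keys it owned move elsewhere. */
    void removeConsumer(String consumerName) {
        ring.values().removeIf(consumerName::equals);
    }

    /** Picks the owner of the first point at or after the key's hash, wrapping around. */
    String select(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // CRC32 is a stand-in hash picked only because it ships with the JDK.
    private static long hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

The property that matters for the discussion is visible in `removeConsumer`: when a consumer leaves, keys owned by the remaining consumers stay where they were, regardless of the order in which consumers joined or left.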
Re: [DISCUSSION] PIP-120: Enable client memory limit by default
+1

On Tue, Dec 14, 2021 at 11:20 AM Matteo Merli wrote:

> https://github.com/apache/pulsar/issues/13306
>
> Pasted below for quoting convenience.
>
> ## Motivation
>
> In Pulsar 2.8, we have introduced a setting to control the amount of memory
> used by a client instance.
>
> ```java
> interface ClientBuilder {
>     ClientBuilder memoryLimit(long memoryLimit, SizeUnit unit);
> }
> ```
>
> By default, in 2.8 and 2.9 this setting is set to 0, meaning no limit is being
> enforced.
>
> I think it's a good time for 2.10 to enable this setting by default and,
> correspondingly, to disable by default the producer queue size limit.
>
> This will greatly simplify the configuration that a producer application has
> to come up with when publishing to many topics/partitions or when messages
> are bigger than expected.
>
> ## Proposed changes
>
> In 2.10 release, for the `ClientBuilder`, change
> * `memoryLimit`: 0 -> 64 MB
>
> For the `ProducerBuilder`, change
> * `maxPendingMessages`: 1000 -> 0
>
> 64MB is picked because it's a small enough memory size that will guarantee
> a very high producer throughput, irrespective of the individual messages
> size.
>
> --
> Matteo Merli
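To illustrate why a single byte budget is simpler to reason about than a per-producer message count, here is a small sketch of a shared memory limiter. It is not the Pulsar client's implementation: `MemoryLimiterSketch` and its methods are hypothetical names, and the only behaviour borrowed from the proposal is that a limit of 0 means no limit is enforced.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical byte-budget limiter shared by all producers of one client. */
final class MemoryLimiterSketch {

    private final long limitBytes; // 0 means "no limit", mirroring the 2.8/2.9 default
    private final AtomicLong usedBytes = new AtomicLong();

    MemoryLimiterSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Tries to account for a pending message; returns false if the budget would be exceeded. */
    boolean tryReserve(long messageBytes) {
        if (limitBytes <= 0) {
            usedBytes.addAndGet(messageBytes);
            return true;
        }
        while (true) {
            long current = usedBytes.get();
            if (current + messageBytes > limitBytes) {
                return false;
            }
            if (usedBytes.compareAndSet(current, current + messageBytes)) {
                return true;
            }
        }
    }

    /** Returns the bytes of a message once it has been acknowledged or has failed. */
    void release(long messageBytes) {
        usedBytes.addAndGet(-messageBytes);
    }

    long usedBytes() {
        return usedBytes.get();
    }
}
```

With a count-based limit, 1000 pending messages per producer can mean very different amounts of memory depending on message size and on the number of topics and partitions; a byte budget bounds the total directly.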
Re: [DISCUSSION] PIP-120: Enable client memory limit by default
+1

On Fri, Dec 17, 2021 at 12:38, Sijie Guo wrote:
> +1
Re: [DISCUSSION] PIP-118: Do not restart brokers when ZooKeeper session expires
+1

Enrico

On Fri, Dec 17, 2021, 05:36 Sijie Guo wrote:
> +1
[PR] Pulsar non root docker image
Hi Pulsar Community,

I opened a PR to make our pulsar and pulsar-all docker images non root and OpenShift compliant [0]. As some may remember, we had issues with these changes before due to lack of testing. I plan to test thoroughly before we merge this PR, and it'd be great to have others test too. I published a build of my PR [1]. I also have an issue [2] tracking this work.

Please take a look. I hope to make our 2.10 release a non root release!

Thanks,
Michael

[0] https://github.com/apache/pulsar/pull/13376
[1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
[2] https://github.com/apache/pulsar/issues/11269
Re: [DISCUSSION] PIP-120: Enable client memory limit by default
+1

On Fri, 17 Dec 2021 at 13:56, 陳智弘 wrote:

> +1
>
> On Fri, Dec 17, 2021 at 12:38, Sijie Guo wrote:
>
> > +1
> >
> > On Tue, Dec 14, 2021 at 11:20 AM Matteo Merli wrote:
> >
> > > https://github.com/apache/pulsar/issues/13306
> > >
> > > Pasted below for quoting convenience.
> > >
> > > ## Motivation
> > >
> > > In Pulsar 2.8, we introduced a setting to control the amount of memory
> > > used by a client instance.
> > >
> > > ```java
> > > interface ClientBuilder {
> > >     ClientBuilder memoryLimit(long memoryLimit, SizeUnit unit);
> > > }
> > > ```
> > >
> > > By default, in 2.8 and 2.9 this setting is set to 0, meaning no limit is
> > > being enforced.
> > >
> > > I think it's a good time for 2.10 to enable this setting by default and,
> > > correspondingly, to disable by default the producer queue size limit.
> > >
> > > This will greatly simplify the configuration that a producer application
> > > has to come up with when publishing to many topics/partitions or when
> > > messages are bigger than expected.
> > >
> > > ## Proposed changes
> > >
> > > In the 2.10 release, for the `ClientBuilder`, change
> > > * `memoryLimit`: 0 -> 64 MB
> > >
> > > For the `ProducerBuilder`, change
> > > * `maxPendingMessages`: 1000 -> 0
> > >
> > > 64 MB is picked because it's a small enough memory size that will
> > > guarantee a very high producer throughput, irrespective of the
> > > individual message size.
> > >
> > > --
> > > Matteo Merli
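The motivation in PIP-120 can be illustrated with some back-of-the-envelope arithmetic. The sketch below is a deliberately simplified model, not Pulsar client code: the class `ProducerMemoryEstimate` and its methods are hypothetical, and it assumes that a count-based limit lets every producer buffer up to `maxPendingMessages` messages of a given size, ignoring any other caps the client may apply. It compares that worst case against a fixed 64 MB budget.

```java
public class ProducerMemoryEstimate {

    // 64 MB, the client-wide default proposed in PIP-120.
    static final long MEMORY_LIMIT_BYTES = 64L * 1024 * 1024;

    // Worst-case bytes buffered under a count-based limit (simplified model):
    // every producer fills its queue with messages of the given size.
    static long worstCaseQueueBytes(int producers, int maxPendingMessages, long messageBytes) {
        long messages = Math.multiplyExact((long) producers, (long) maxPendingMessages);
        return Math.multiplyExact(messages, messageBytes);
    }

    // True when the count-based worst case stays within the byte budget.
    static boolean fitsInLimit(int producers, int maxPendingMessages, long messageBytes) {
        return worstCaseQueueBytes(producers, maxPendingMessages, messageBytes) <= MEMORY_LIMIT_BYTES;
    }

    public static void main(String[] args) {
        // One producer, 1 KB messages: about 1 MB, comfortably bounded.
        System.out.println(worstCaseQueueBytes(1, 1000, 1024));
        // 200 producers, 1 KB messages: about 205 MB.
        System.out.println(worstCaseQueueBytes(200, 1000, 1024));
        // One producer, 1 MB messages: about 1 GB.
        System.out.println(worstCaseQueueBytes(1, 1000, 1024 * 1024));
        System.out.println(fitsInLimit(1, 1000, 1024));
        System.out.println(fitsInLimit(200, 1000, 1024));
    }
}
```

In this model the count-based bound scales with both the number of producers and the message size, so an application has to retune it whenever either changes. A single byte budget per client stays fixed regardless, which is the simplification the proposal is after.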
Re: [VOTE] Apache Pulsar 2.9.1 candidate 2
Checked:
- Build from the src
- Check signatures
- Follow the validation process

But when I tried to verify Pulsar SQL, I got the following exception:

```
2021-12-17T14:58:18.958+0800 ERROR remote-task-callback-3 io.prestosql.execution.StageStateMachine Stage 20211217_065818_1_cahiv.1 failed
com.google.common.util.concurrent.UncheckedExecutionException: java.nio.BufferUnderflowException
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051)
	at com.google.common.cache.LocalCache.get(LocalCache.java:3951)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4935)
	at org.apache.pulsar.sql.presto.PulsarSqlSchemaInfoProvider.getSchemaByVersion(PulsarSqlSchemaInfoProvider.java:76)
	at org.apache.pulsar.sql.presto.PulsarRecordCursor.advanceNextPosition(PulsarRecordCursor.java:485)
	at io.prestosql.spi.connector.RecordPageSource.getNextPage(RecordPageSource.java:90)
	at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
	at io.prestosql.operator.Driver.processInternal(Driver.java:379)
	at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
	at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
	at io.prestosql.operator.Driver.processFor(Driver.java:276)
	at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
	at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
	at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
	at io.prestosql.$gen.Presto_332__testversion20211217_065757_2.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.BufferUnderflowException
	at java.nio.Buffer.nextGetIndex(Buffer.java:532)
	at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:417)
	at org.apache.pulsar.sql.presto.PulsarSqlSchemaInfoProvider.loadSchema(PulsarSqlSchemaInfoProvider.java:106)
	at org.apache.pulsar.sql.presto.PulsarSqlSchemaInfoProvider.access$000(PulsarSqlSchemaInfoProvider.java:49)
	at org.apache.pulsar.sql.presto.PulsarSqlSchemaInfoProvider$1.load(PulsarSqlSchemaInfoProvider.java:61)
	at org.apache.pulsar.sql.presto.PulsarSqlSchemaInfoProvider$1.load(PulsarSqlSchemaInfoProvider.java:58)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
	... 18 more
```

An issue can be found here: https://github.com/apache/pulsar/issues/12284. My test steps are very simple:

1. Start presto worker, `bin/pulsar sql-worker run`
2. Produce some messages, `bin/pulsar-client produce -m "hello" -n 10 test_wordcount_src`
3. Query the data from the topic, `select * from pulsar."public/default"."test_wordcount_src";`

I was not able to query the produced data and got errors in the Pulsar SQL worker.
Penghui

On Fri, Dec 17, 2021 at 5:33 AM Matteo Merli wrote:

> +1
>
> Checked:
> * Signatures
> * Bin distribution:
>   - NOTICE, README, LICENSE
>   - Start standalone service and producer/consumer test
> * Src distribution:
>   - NOTICE, README, LICENSE
>   - Compile and unit tests
>   - Start standalone service
> * Checked staging maven repository artifacts
> * Checked docker images
>
> Matteo
>
> --
> Matteo Merli
>
> On Thu, Dec 16, 2021 at 12:53 PM Enrico Olivelli wrote:
> >
> > I have pushed the docker images to my personal Docker Hub account
> >
> > eolivelli/pulsar:2.9.1rc2
> > eolivelli/pulsar-all:2.9.1rc2
> >
> > Enrico
> >
> > On Thu, Dec 16, 2021, 15:57 Nicolò Boschi wrote:
> >
> > > +1 (non binding)
> > >
> > > Checks:
> > > - Checksum and signatures
> > > - Apache Rat check passes
> > > - OWASP check passes (I created this PR to fix a false positive
> > >   https://github.com/apache/pulsar/pull/13364)
> > > - Compile from source w JDK11
> > > - Build docker image from source
> > > - Run Pulsar standalone and produce-consume from CLI
> > > - verified the presence of Log4j 2.16.0 jar in docker and tarball
> > >
> > > On Thu, Dec 16, 2021 at 14:25, Enrico Olivelli <eolive...@gmail.com> wrote:
> > >
> > > > This is the second release candidate for Apache Pulsar, version 2.9.1.
> > > >
> > > > The first release candidate was aborted without starting a VOTE because
> > > > we had to pick up high priority dependency upgrades.
> > > >
> > > > It fixes the following issues:
> > > >
> > > > https://github.com/apache/pulsar/pulls?q=is%3Apr++label%3Arelease%2F2.9.1+
> > > >
> > > > *** Please download, test and vote on this release
Re: Dropping Presto SQL in 2.9.0 - status ?
Hi Marvin,

Great work on the Trino PR! It's been a lot of work to get it to match the Trino code conventions.

I hope we could drop Presto & Pulsar SQL from the apache/pulsar code repository as planned in PIP-62 [1], "PIP 62: Move connectors, adapters and Pulsar Presto to separate repositories", which was created in April 2020. Let's work together to complete this effort.

Is there anything that others could help with to complete the Trino PR https://github.com/trinodb/trino/pull/8020 ?

BR,
Lari

[1] https://github.com/apache/pulsar/wiki/PIP-62%3A-Move-connectors%2C-adapters-and-Pulsar-Presto-to-separate-repositories

On Wed, Nov 17, 2021 at 3:40 PM Zhengxin Cai wrote:

> Hi there,
> I think the PR is still open, https://github.com/trinodb/trino/pull/8020,
> will try to push it.
> But even after the PR is merged, I still think we might want
> to keep a copy of the connector in Pulsar repo and push changes to Trino
> repo periodically, as this will allow much faster bug fix and feature
> iteration.
> Best,
> Marvin
>
> On Wed, Nov 17, 2021 at 2:19 PM Lari Hotari wrote:
>
> > Dear Pulsar community members,
> >
> > PIP-62 [1], "PIP 62: Move connectors, adapters and Pulsar Presto to separate
> > repositories" was created in April 2020. The repositories for
> > pulsar-connectors, pulsar-adapters and pulsar-sql were created about a year
> > ago [2].
> >
> > What is the current roadmap for completing PIP-62 and moving
> > pulsar-connectors and pulsar-sql out of apache/pulsar repository?
> >
> > BR,
> >
> > Lari
> >
> > [1] https://github.com/apache/pulsar/wiki/PIP-62%3A-Move-connectors%2C-adapters-and-Pulsar-Presto-to-separate-repositories
> > [2] https://lists.apache.org/thread.html/r9e6ec742e2896da1f0ce7d4adc7cb84fc6db6dbf797732ccdd50fb86%40%3Cdev.pulsar.apache.org%3E
> >
> > Other email threads:
> > * [Discuss] Don't include presto/trino in the normal Pulsar distribution -
> >   https://lists.apache.org/thread/jn96tct54mn0tvdot62vdslrvs38fm6d
> > * Updates on Presto connector for PIP-62 -
> >   https://lists.apache.org/thread/f9n6sc2mrboq5sxhjbr7gvdl8vqp9fpk
> >
> > On Tue, Nov 2, 2021 at 3:59 PM Nicolò Boschi wrote:
> >
> > > Resurrecting this thread.
> > >
> > > 2.9 is almost released and it hasn't been merged yet
> > >
> > > Extending the discussion to other connectors, it looks like there has been
> > > no progress on PIP-62.
> > > My concern is that a lot of Pulsar IO connectors dependencies we are
> > > running are obsolete with several security reports
> > >
> > > I see there are interesting comments in the issue (
> > > https://github.com/apache/pulsar/issues/10219) and Sijie exported the
> > > pulsar-io dir to https://github.com/apache/pulsar-connectors but it's
> > > outdated
> > >
> > > From my point of view, we have to:
> > > - reimport all the connectors source codes with newest ones (including
> > >   integration tests)
> > > - add periodic CI jobs for connectors to run against master, 2.9-latest,
> > >   2.8-latest, 2.7-latest to verify breaking changes
> > > - define a release cycle/management for connectors (we should improve the
> > >   PIP doc). IMO it's not clear if each connector will have its own release
> > >   versions and how we'll handle it (git tags, artifacts deployment..)
> > > - update pulsar release script in order to get the connectors artifacts
> > >   (retrieving the .nar or building it from source?)
> > > - update docs
> > > - remove pulsar-io dir from Pulsar repo
> > >
> > > It's the perfect timing to schedule this work for 2.10
> > >
> > > What is missing? How's the situation? Is there a roadblock I haven't seen?
> > > I think it's better to take another discussion for Presto since it will
> > > come to another end
> > >
> > > On Sat, Aug 14, 2021 at 15:21, Enrico Olivelli <eolive...@gmail.com> wrote:
> > >
> > > > Sijie
> > > >
> > > > On Fri, Aug 13, 2021, 22:00 Sijie Guo wrote:
> > > >
> > > > > You can follow the progress at
> > > > > https://github.com/trinodb/trino/pull/8020.
> > > >
> > > > Thanks for the pointer
> > > >
> > > > > The original code doesn't conform to TrinoDB's standard. Marvin is
> > > > > actively following up on that.
> > > > >
> > > > > Our goal is still to get this completed as part of the 2.9 release.
> > > >
> > > > Wonderful
> > > >
> > > > Thanks
> > > > Enrico
> > > >
> > > > > - Sijie
> > > > >
> > > > > On Fri, Aug 13, 2021 at 2:04 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > > > >
> > > > > > Hello,
> > > > > > How is the Presto work going ?
> > > > > > IIRC the plan was to remove it from the Pulsar code base and let it be
> > > > > > hosted at Trino.
> > > > > >
> > > > > > If this is not going to happen within the 2.9.0 release timeline
> > > > > > (September?) I would prefer to upgrade to "Trino".
> > > > > > Probably we will have a downside problem that recent versions of
> > > > > > Presto/Trino