Re: Re-running failed flaky builds in refactored Pulsar CI GitHub Actions workflow
GitHub Actions currently has a problem, and the UI shows the warning "We are having problems searching workflow runs. The results may not be complete." (I can see this warning on https://github.com/apache/pulsar/actions)

The impact of this is that "/pulsarbot rerun-failure-checks" doesn't work when it cannot find the failed or cancelled workflow runs.

-Lari

On 2022/04/08 07:01:33 Lari Hotari wrote:
> With the new GitHub Actions CI workflow there are cases where you see a red mark as a failure, but there's no need to rerun failed jobs since the red failure marks are a result of failed test reports (usually from failed flaky tests).
>
> The new Pulsar CI workflow renders JUnit XML test reports and integrates them into the GitHub UI. There are multiple benefits to this. The test failures will be shown directly in the PR review.
>
> You will see red failure marks without a failed job when flaky tests fail, but later pass in a retry. The failed test result will get recorded to a test report, but there's no need to rerun failed jobs.
>
> This doesn't block merging, but will show up so that the failures can be inspected. This can be confusing at first, since everyone is used to rerunning jobs when there's a red failure mark shown in the PR.
>
> It might appear that "/pulsarbot rerun-failure-checks" is broken. That's not the case. Usually the issue is that there's no failed job or the workflow where a job has failed is still executing. A failed job in a workflow can only be rerun after the whole workflow completes. That's explained in an earlier message in this thread.
>
> With test reports, there's an additional source of confusion, since GitHub Actions has a bug where the test reports get attached randomly to a workflow when multiple workflows are executing. It's a known issue and once GitHub fixes the bug, it will be resolved.
> (here's a link to one of the reports about the GitHub Actions bug:
> https://github.community/t/github-actions-status-checks-created-on-incorrect-check-suite-id/16685)
>
> Please let me know if you have trouble with the new Pulsar CI GitHub Actions workflow and let's try to resolve the issues together.
>
> I'll try to find a place to document the details that are mentioned in this email thread.
>
> -Lari
>
>
> On 2022/04/01 14:34:02 Lari Hotari wrote:
> > I now realized that my advice to close & reopen PRs to pick up master branch changes is problematic. This will cause issues with "/pulsarbot rerun-failure-checks". The script currently looks for the build to restart with the PR's head commit sha. If closing and reopening is used to start new PR build jobs, all build jobs will have the same head commit sha attached to them. When checking for failed builds, the script will also find old builds with the same head commit sha and restart them too.
> >
> > Please rebase your PR (or merge master branch changes to it) to pick up changes from master. Don't close & reopen PRs as I had advised earlier since it causes problems. The wrong builds will be run and that adds up in the build queue.
> >
> > -Lari
> >
> >
> >
> > On 2022/04/01 08:38:54 Lari Hotari wrote:
> > > Hi all,
> > >
> > > There's a small limitation in re-running failed jobs (builds that fail because of flaky tests) in the refactored Pulsar CI workflow which combines multiple jobs into a single workflow.
> > >
> > > The limitation is that you need to wait for all jobs to complete before failed jobs can be re-run.
> > > Yesterday there was some issue with GitHub Actions and the build queue was several hours long. When there's enough build capacity and no build queue, the new workflow finishes in about 1 hour 20 minutes.
> > >
> > > Re-running failed jobs can be requested by commenting "/pulsarbot rerun-failure-checks" on the PR. This won't do anything if one of the jobs in the workflow is still executing.
> > >
> > > Another source of confusion has been the new test reporting, which shows all test results and test failures as checks and annotations in the GitHub UI.
> > >
> > > Here's an example:
> > > https://github.com/apache/pulsar/pull/14805/checks?check_run_id=5777139002
> > >
> > > There's a limitation in GitHub Actions that the test reports get attached to the first workflow when a PR triggers more than one workflow. We still have multiple workflows and the test reports get attached to the "CI - CPP, Python Tests" workflow. Failed tests will show up as red check marks and in the case of retries, the test might have succeeded in a later attempt, but the check shows as failed. This won't prevent merging the PR. Please keep this small detail in mind when interpreting the build results.
> > >
> > > The test reports are very verbose at the moment. This is a problem when checking the PR build results on GitHub
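
The thread above describes two constraints on re-running: a workflow run can only be re-run once it has completed, and after a close & reopen several runs share the same head commit sha, so older runs get restarted by mistake. The following is a minimal sketch of one way such a selection could be made. It is not the actual pulsarbot script; the function name `runs_to_rerun` is hypothetical, and the dictionary keys (`id`, `workflow_id`, `run_number`, `head_sha`, `status`, `conclusion`) are assumed to follow the workflow run objects returned by the GitHub REST API.

```python
from typing import Dict, List

# Conclusions that the thread treats as candidates for a re-run.
RERUNNABLE = {"failure", "cancelled"}


def runs_to_rerun(runs: List[dict], head_sha: str) -> List[int]:
    """Return ids of workflow runs worth re-running for one head commit.

    Hypothetical helper: keeps only the newest run of each workflow for
    the given sha, then keeps it only if it has completed with a failed
    or cancelled conclusion.
    """
    latest: Dict[int, dict] = {}
    for run in runs:
        if run["head_sha"] != head_sha:
            continue  # belongs to another commit
        current = latest.get(run["workflow_id"])
        # A higher run_number means a newer run of the same workflow.
        if current is None or run["run_number"] > current["run_number"]:
            latest[run["workflow_id"]] = run
    return sorted(
        run["id"]
        for run in latest.values()
        if run["status"] == "completed" and run["conclusion"] in RERUNNABLE
    )


example_runs = [
    # Old failed run left over from before a close & reopen.
    {"id": 1, "workflow_id": 10, "run_number": 5, "head_sha": "abc",
     "status": "completed", "conclusion": "failure"},
    # Newer run of the same workflow, which passed.
    {"id": 2, "workflow_id": 10, "run_number": 6, "head_sha": "abc",
     "status": "completed", "conclusion": "success"},
    # Still executing, so it cannot be re-run yet.
    {"id": 3, "workflow_id": 11, "run_number": 3, "head_sha": "abc",
     "status": "in_progress", "conclusion": None},
    # Completed and cancelled: the only candidate.
    {"id": 4, "workflow_id": 12, "run_number": 9, "head_sha": "abc",
     "status": "completed", "conclusion": "cancelled"},
]
print(runs_to_rerun(example_runs, "abc"))
```

Keeping only the newest run per workflow is one possible way to avoid restarting stale builds; it does not help when the workflow run search itself returns incomplete results, as reported at the top of this message.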
[GitHub] [pulsar-test-infra] lhotari commented on pull request #28: fix: fix get failured checks
lhotari commented on PR #28: URL: https://github.com/apache/pulsar-test-infra/pull/28#issuecomment-1098935588 There are more details in https://lists.apache.org/thread/60x7sqg2p4mlssj5jtow6zwq3jksf6w3 . Currently there's an issue with GitHub Actions where the workflow run search doesn't return all results, and that's why rerunning failed jobs doesn't work in some cases. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pulsar.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [pulsar-helm-chart] LvLs9 commented on pull request #251: make proxy httpNumThreads configurable
LvLs9 commented on PR #251: URL: https://github.com/apache/pulsar-helm-chart/pull/251#issuecomment-1099086763 Please merge that awesome PR!
[GitHub] [pulsar-manager] john1337 commented on issue #429: Bug: default user pulsar/pulsar does not have permission to create a new environment
john1337 commented on issue #429: URL: https://github.com/apache/pulsar-manager/issues/429#issuecomment-1099088156 One year has passed, and still nobody has fixed it?
[GitHub] [pulsar-manager] eolivelli commented on issue #429: Bug: default user pulsar/pulsar does not have permission to create a new environment
eolivelli commented on issue #429: URL: https://github.com/apache/pulsar-manager/issues/429#issuecomment-1099095246 @john1337 would you have time to send a fix?
[discuss] When broker crash and some bookie crash, whether there is an inconsistency in recover
Hi all,

When a broker crashes, its topics are transferred to other active brokers. At that point the topic is in the fencing state and is recovered, because the last confirmed message ID is not the latest.

Steps (E=3, W=3, A=2):
1. Find the largest LAC among the bookie nodes where the ledger is located.
2. Compare backwards, checking whether each entry satisfies ack >2.
3. If the condition is met, the entry is considered safe and the LAC is updated.
4. Finally, update the metadata information and close the ledger.

My question is: if a bookie also crashes at this time, how can the LAC be set safely?

E.g.:
[image: image.png]

The LAC stored on a bookie is brought over by the broker with a delay; therefore, the LAC can differ between bookies.

This question has been bothering me, and I hope there is an accurate explanation.

Thanks,
Dezhi Liu
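
To make the question concrete, the steps listed in the message can be written down as a toy model. This is only a sketch of the procedure as the message describes it, not BookKeeper's actual recovery code. Several things are assumptions made for illustration: the names `BookieState` and `recover_lac` are hypothetical, a crashed bookie is modelled as `None`, "ack >2" is read as "at least A = 2 live bookies hold the entry", and step 2 is interpreted as walking the entries that follow the largest LAC.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class BookieState:
    """What one bookie knows about a ledger (hypothetical toy model)."""
    lac: int                                  # last add confirmed seen by this bookie
    entries: Set[int] = field(default_factory=set)  # entry ids stored on this bookie


def recover_lac(bookies: List[Optional[BookieState]], ack_quorum: int = 2) -> int:
    """Follow the steps from the message; None stands for a crashed bookie."""
    live = [b for b in bookies if b is not None]
    if len(live) < ack_quorum:
        raise RuntimeError("not enough live bookies to satisfy the ack quorum")
    # Step 1: take the largest LAC reported by any live bookie.
    lac = max(b.lac for b in live)
    # Steps 2 and 3: advance the LAC while the next entry is held by
    # at least ack_quorum live bookies.
    next_entry = lac + 1
    while sum(next_entry in b.entries for b in live) >= ack_quorum:
        lac = next_entry
        next_entry += 1
    # Step 4 (updating metadata and closing the ledger) is not modelled.
    return lac


# Three bookies with different LACs; the third one has crashed.
with_crash = [
    BookieState(lac=3, entries={0, 1, 2, 3, 4, 5}),
    BookieState(lac=4, entries={0, 1, 2, 3, 4, 5}),
    None,
]
print(recover_lac(with_crash))
```

In this model, the differing per-bookie LAC values only affect the starting point of the walk. Whether the result stays consistent when a bookie is down depends on how many live copies of the trailing entries remain, which is the case the question asks about.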
[GitHub] [pulsar-helm-chart] lhotari opened a new pull request, #259: [Proxy] Increase httpNumThreads from 8 to 16
lhotari opened a new pull request, #259: URL: https://github.com/apache/pulsar-helm-chart/pull/259

- after the recent changes in Pulsar Proxy, the proxy won't start if httpNumThreads is set to 8 and there are >= cores assigned to the pod

Error message is
```
Caused by: java.lang.IllegalStateException: Insufficient configured threads: required=8 < max=8 for WebExecutorThreadPool[etp884604029]@34b9fc7d{STARTED,8<=8<=8,i=2,q=0,ReservedThreadExecutor@3fcee3d9{s=0/1,p=0}}
```
Pulsar Community Meeting Notes 2022/04/14, (8:30 AM PST)
Hi Pulsar Community,

Below are the meeting notes from today's community meeting.

Disclaimer: I am the primary author of these notes. I took the notes while participating in the meeting discussions. It is possible that I missed or misunderstood information. If something is misattributed or misrepresented, please send a correction to this list and consider updating the Google doc.

Source google doc:
https://docs.google.com/document/d/19dXkVXeU2q_nHmkG8zURjKnYlvD96TbKf5KjYyASsOE

Thanks,
Michael

2022/04/14, (8:30 AM PST)
- Attendees:
  - Matteo Merli
  - Enrico Olivelli
  - Andrey Yegorov
  - Michael Marshall
  - Dave Fisher
  - Lari Hotari
  - Massimiliano Mirelli
  - Chris Bartholomew
  - Hang Chen
  - Aaron Williams
  - Nicolò Boschi
  - Leolinchen
  - Penghui Li

- Discussions

  - Enrico: 2.10 release process. Took a while. Do we want to talk about this? For 2.11, we should try to apply the new process. Matteo: 3 months from now we can release 2.11, we’ll create the branch in 2 months. Matteo plans to set a date (by discussion on the mailing list) and wants more scrutiny on the mailing list. Dave: we should slow down cherry picking to 2.8 and 2.9, as well. Enrico: we are finding many fixes though, and for example, 2.8 has many users and many bug fixes. The cherry picked commits are all bug fixes. Michael: we should add some documentation about this to help new committers. Matteo: this documentation would help inform contributors too. Dave: where should we put this? Website? Matteo: we could also put it in the PR template.

  - Michael: is 2.7.5 the last 2.7 release? Matteo: could keep it open for security bug fixes, like log4shell type fixes. Lari: 2.7.5 rc 1 has test failures, so we’ll need an rc 2. The tests that are failing on 2.7.5 are passing on 2.7.4. Matteo: thinking through LTS and the cost to users of doing the upgrades. There is a tension between shipping new features and how frequently users have to upgrade. One issue: the upgrade/downgrade compatibility is only guaranteed for one minor version. An LTS could help to support those users without adding features. We could offer guarantees from one LTS to the next LTS. We’d define support so users could stick with a version without worrying about getting left behind. What if 3.0 and 4.0 and so on are LTS, and 3.x is just for features? The guarantee then is that you can go 3.x to 4.0. Dave: what about current users on the 2.x versions? Matteo: we can discuss how to deal with existing versions, but we also need to figure out our preferred long term solution for how to work in the future. Dave: I like the idea of guaranteeing upgrade paths. Matteo: we could try to set a timeline for major releases, not just for minor releases, e.g. every 2 years for a major release. Discusses reasons for major releases and the nuance for how we could use this. Dave: are bookkeeper upgrade and transactions the major upgrade? Matteo: I didn’t have any feature in mind. I want to give people an upgrade path and create clarity. Michael: clarifies that you could upgrade from 3.0 to 4.0 then downgrade and it’d work. Matteo: yes. Feature defaults won’t be able to change because of this. Dave: relates well to creating a road map and telling people what is coming. Enrico: creating a road map is very hard in open source. We commit things that people contribute. In the ASF projects that I work on, contributions are hard to predict. Matteo: I agree it is hard to know. These major releases would be loosely timed. For example, auto partitioning is a major feature, but it is a bunch of work. Unpredictability is bad for the users. Michael: and you don’t want to create a hard upgrade path. Is it possible to use geo-replication (or something like it) to migrate clusters to simplify upgrades? Matteo: there was a work-in-progress proposal for a green-blue deployment that spins up a new cluster and slowly migrates producers and consumers to it. The coordination would be topic termination to switch to the new cluster. Not sure that it is a general solution. Michael: how would breaking changes work for the major version upgrade? Matteo: we would do a compatibility layer. Also, the pulsar protocol hasn’t broken, and we version the api in such a way that the broker/client determine if the peer supports that feature.

- PRs

  - Lari: Merged PR (https://github.com/apache/pulsar/pull/15067) to fix ManagedCursorImpl’s mark delete update logic, but asked for Matteo’s review. Lari plans to add more tests in the coming weeks to catch regressions associated with the change.

  - Andrey: https://github.com/apache/pulsar/pull/15142 WIP pulsar + bk 4.15-ish. Requests review of preliminary work, mentions that there is a test failure he’s still investigating. Switched CI to use Bookkeeper 4.16-SNAPSHOT to identify needed changes. Worked on tests that broke. Some test classes were copied from bookkeeper, so he replaced those with copy/pasted new ones. The work is iterative, and there are still tes
Re: Pulsar Community Meeting Notes 2022/04/14, (8:30 AM PST)
Thanks Michael for sending out the notes. Recording is available here: https://streamnative.zoom.us/rec/share/Eg2E7WfSOfPaHMdSphlrP-fN2NBjh4aT06eVTxv6TbBk4ujTltCcPNvq9kwHqMT4.mBdaRHY5eUXJM5bz Passcode: .H?wa4WM -- Matteo Merli
[GitHub] [pulsar-site] urfreespace merged pull request #49: nav styling and community carousel updates
urfreespace merged PR #49: URL: https://github.com/apache/pulsar-site/pull/49
[GitHub] [pulsar-test-infra] dependabot[bot] opened a new pull request, #29: Bump minimist from 1.2.0 to 1.2.6 in /paths-filter
dependabot[bot] opened a new pull request, #29: URL: https://github.com/apache/pulsar-test-infra/pull/29

Bumps [minimist](https://github.com/substack/minimist) from 1.2.0 to 1.2.6.

Commits
- [7efb22a](https://github.com/substack/minimist/commit/7efb22a518b53b06f5b02a1038a88bd6290c2846) 1.2.6
- [ef88b93](https://github.com/substack/minimist/commit/ef88b9325f77b5ee643ccfc97e2ebda577e4c4e2) security notice for additional prototype pollution issue
- [c2b9819](https://github.com/substack/minimist/commit/c2b981977fa834b223b408cfb860f933c9811e4d) isConstructorOrProto adapted from PR
- [bc8ecee](https://github.com/substack/minimist/commit/bc8ecee43875261f4f17eb20b1243d3ed15e70eb) test from prototype pollution PR
- [aeb3e27](https://github.com/substack/minimist/commit/aeb3e27dae0412de5c0494e9563a5f10c82cc7a9) 1.2.5
- [278677b](https://github.com/substack/minimist/commit/278677b171d956b46613a158c6c486c3ef979b20) 1.2.4
- [4cf1354](https://github.com/substack/minimist/commit/4cf1354839cb972e38496d35e12f806eea92c11f) security notice
- [1043d21](https://github.com/substack/minimist/commit/1043d212c3caaf871966e710f52cfdf02f9eea4b) additional test for constructor prototype pollution
- [6457d74](https://github.com/substack/minimist/commit/6457d7440a47f329c12c4a5abfbce211c4235b93) 1.2.3
- [38a4d1c](https://github.com/substack/minimist/commit/38a4d1caead72ef99e824bb420a2528eec03d9ab) even more aggressive checks for protocol pollution
- Additional commits viewable in [compare view](https://github.com/substack/minimist/compare/1.2.0...1.2.6)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

---

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
- `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
- `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
- `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
- `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/apache/pulsar-test-infra/network/alerts).
Re: [ANNOUNCE] New Committer: Zike Yang
Good for you, Zike. Congratulations.

BR//Huanli

> On Apr 14, 2022, at 9:28 AM, Yu wrote:
>
> Congrats Zike! Well deserved!
>
> On Wed, Apr 13, 2022 at 7:00 PM Enrico Olivelli wrote:
>
>> Congratulations
>>
>> Enrico
>>
>> On Wed, Apr 13, 2022, 12:38 Hang Chen wrote:
>>
>>> Congrats Zike!
>>>
>>> Best,
>>> Hang
>>>
>>> On Wed, Apr 13, 2022 at 18:16, Haiting Jiang wrote:
>>>> Congrats!
>>>>
>>>> Thanks,
>>>> Haiting
>>>>
>>>> On 2022/04/13 09:34:23 PengHui Li wrote:
>>>>> The Apache Pulsar Project Management Committee (PMC) has invited Zike Yang
>>>>> https://github.com/RobertIndie to become a committer and we are pleased to
>>>>> announce that he has accepted.
>>>>>
>>>>> Welcome and Congratulations, Zike Yang!
>>>>>
>>>>> Please join us in congratulating and welcoming Zike Yang onboard!
>>>>>
>>>>> Best Regards,
>>>>> Penghui Li on behalf of the Pulsar PMC
[GitHub] [pulsar-site] urfreespace opened a new pull request, #50: fix: sidebar missing docker and helm in next version
urfreespace opened a new pull request, #50: URL: https://github.com/apache/pulsar-site/pull/50 Signed-off-by: Li Li
[GitHub] [pulsar-site] urfreespace merged pull request #50: fix: sidebar missing docker and helm in next version
urfreespace merged PR #50: URL: https://github.com/apache/pulsar-site/pull/50