Hi Ard, I did some experiments with the 2 methods discussed earlier in this thread to disable the specific check being discussed in this patch. Details of the experiments and analysis of the results are included at the end of the email.
1) Use GitHub PR Label 2) Add a special tags to a commit message My conclusion is that using a PR Label to control CI actions is not a good approach due to some behaviors in GitHub and that I recommend that special tags be used in commit messages as a method to enable/disable specific CI check options. There is actually precedence for use of commit message tags in GitHub to skip CI workflow runs: https://docs.github.com/en/actions/managing-workflow-runs/skipping-workflow-runs The proposal is to define a tag name 'Continuous-integration-options:' that can appear on one or more lines of the commit message and the tag value is a list of one or more CI check options. This defines a mechanism that can be extended as needed. For the specific check under discussion here, I am proposing an option value of 'PatchCheck.ignore-multi-package' An example commit message line to disable the multiple package check for the one commit would be: Continuous-integration-options: PatchCheck.ignore-multi-package When PatchCheck evaluates the commit message, if this tag/value is present, then the multiple package check is still run, but the log shows results as WARNING messages, and no errors are raised and CI receives a PASS result. Please provide feedback on this updated proposal. Thanks, Mike ======================================================== GitHub PR Label Experiments ============================ An experiment was done to use a GitHub PR Label to enable/disable the multiple package check. PatchCheck was ported to a GitHub Action for this experiment and a label with name 'ignore-multi-package' is used to disable the PatchCheck multiple packages check. PatchCheck is updated to support a --ignore-multi-option flag that is passed into PatchCheck when the label is set. Description Result Label Set PR Link ------------------ ------- --------- ---------------------------------------- Modify 1 package PASS NO https://github.com/mdkinney/edk2/pull/15 Modify 1 package PASS YES https://github.com/mdkinney/edk2/pull/15 Modify 2 packages FAIL NO https://github.com/mdkinney/edk2/pull/16 Modify 2 packages PASS YES https://github.com/mdkinney/edk2/pull/17 Main Branch: * https://github.com/mdkinney/edk2/tree/CiEnabled Patch Check with --ignore-multi-package option: * https://github.com/mdkinney/edk2/blob/CiEnabled/BaseTools/Scripts/PatchCheck.py GitHub Action Files: * https://github.com/mdkinney/edk2/blob/CiEnabled/.github/workflows/patchcheck.yml * https://github.com/mdkinney/edk2/blob/CiEnabled/.github/workflows/patchcheck_call.yml The expected results were achieved where the label has no impact on a PR with commits that only modifies a single package, and the label being set can change a PR from FAIL->PASS if one or more of the commits in the PR modify multiple packages. The disadvantages of this approach are: 1) The scope of the label is the entire PR for all commits instead of the specific commits that require the check to be skipped. 2) The information to skip a specific check is only in the GitHub PR in the form of a label being set. The git commit history has no information on the label settings. This can cause side effects if downstream consumers use some of the edk2 CI checkers and may then see unexpected CI failures. 3) CI is run again when commits are merged with a 'push' action. It was not verified, but the label information is not provided when a push is performed, so the result of using labels to change CI behavior may be a PASS condition when the PR is being reviewed, and a FAIL condition when the commits from the PR are merged. This is not an expected or acceptable state of the main branch. 4) The GitHub action must add 'labeled' and 'unlabeled' PR types for the GitHub action to run when a label is added or removed. There is no mechanism to filter based on which specific label was added or removed. This means CI must be invoked each time a label is modified in a PR, even for labels that are not related to CI behavior. This is not efficient for CI and implies that this mechanism does not scale well if several of these CI option labels are added over time. * Experiments were run to filter based in label inside the GitHub Action itself. This *only* worked if a CI related label is modified. If a non-CI related label is modified, the GitHub Action is skipped, and the result is PASS even if the result of the last run was FAIL. This would allow merges of PRs with known issues. * The other option is to remove 'labeled' and 'unlabeled' events from the GitHub Action. This can work, but has an unexpected behavior from the maintainer perspective. Setting a label does not re-run CI. Instead, the label has to be set, then either the PR must be closed and re-opened, or additional commits have to be added, or commits must be force pushed in order for the CI check to be re-run with the label changes. This also implies that that PASS/FAIL state of the PR can be out of sync with the current label settings if those additional actions are not performed after the labels are modified. Commit Message Tag Experiments ============================== An experiment was done to use a commit message tag/value to disable the multiple package check on only the commits that contain a matching tag/value. The tag/value used is: Continuous-Integration-Options: PatchCheck.ignore-multi-package Description Result Tag Present PR Link ------------------ ------- ----------- ---------------------------------------- Modify 1 package PASS NO https://github.com/mdkinney/edk2/pull/30 Modify 2 packages FAIL NO https://github.com/mdkinney/edk2/pull/28 Modify 2 packages PASS YES https://github.com/mdkinney/edk2/pull/29 Main Branch: * https://github.com/mdkinney/edk2/tree/CommitMessageCiFilter Patch Check with Continuous-Integration-Options tag parsing * https://github.com/mdkinney/edk2/blob/CommitMessageCiFilter/BaseTools/Scripts/PatchCheck.py GitHub Action File: * https://github.com/mdkinney/edk2/blob/CommitMessageCiFilter/.github/workflows/patchcheck.yml The expected results were achieved when adding the tag/value to the commit message of a commit that modifies multiple packages changes the PR from FAIL->PASS. The disadvantages of this approach are: 1) Adds CI specific tags to commit messages and commit history 2) Requires author or maintainer to update commit message with a CI specific tag/value if a multiple package check fails and is considered a false positive. The advantages of this approach are: 1) The scope of the tag/value is a single commit. Not the entire PR. 2) The information about what CI checks to skip are part of the commit message and the commit history so the information can be reused if needed in downstream CI. 3) The information is in the commit message, so it is available for both PR CI checks and 'push' CI checks. 4) No new behavior from a maintainer perspective on how/when CI checks are run and when the status if CI checks are available. ================================================================== > -----Original Message----- > From: Kinney, Michael D <michael.d.kin...@intel.com> > Sent: Wednesday, February 14, 2024 4:49 PM > To: Ard Biesheuvel <a...@kernel.org>; devel@edk2.groups.io > Cc: Leif Lindholm <quic_llind...@quicinc.com>; Rebecca Cran > <rebe...@bsdio.com>; Liming Gao <gaolim...@byosoft.com.cn>; Feng, Bob C > <bob.c.f...@intel.com>; Chen, Christine <yuwei.c...@intel.com>; Michael > Kubacki <mikub...@linux.microsoft.com>; Kinney, Michael D > <michael.d.kin...@intel.com> > Subject: RE: [edk2-devel] [Patch 1/1] BaseTools/Scripts/PatchCheck: > Error if commit modifies multiple packages > > Hi Ard, > > I agree we do not want to code to be worse. Which I interpret > in this context as introducing extra commits that reduce the > readability of the commits. > > We also need to consider the ease of review of patches and clear > responsibility for providing reviews. Especially when we have > different reviewers/maintainers for different packages. It makes > it easier to review a patch when patches do not cross package > boundaries. Otherwise, what does a Reviewed-by mean if you > only have context for a portion of the patch? Does that mean > that reviewer is confident in all the changes including those > they may not have any experience with? > > That means there is some tension between readability of commits > and code review of commits. > > The other topic brought up in this discussion is bisect. > > I understand the value of bisect to help identify the source > of a bug or behavior change. > > However, the current EDK II CI system does not test at the > granularity of single commits. Instead, it tests the entire > patch series in a PR. > > I have also observed some maintainers combining multiple > patch series into a single PR. > > I also suspect that there are many patch series in the edk2 > commit history that would not work if bisect was run across > them today. > > What this means in practice with the current edk2 repository > history is that we do not know if every commit can be built > and run. We only know that the EDK II CI tests that are run > against PRs have passed. > > We do expect build/run at the granularity of a PR merge today. > However, there is no information in the git history to know > where the PR merges occur. If that extra bit of information > was available, then bisect could select the commits at PR > merge boundaries to look for bug/behavior change. The history > of merged PRs is in github using the following query. > > https://github.com/tianocore/edk2/pulls?q=is%3Apr+is%3Aclosed+is%3Amerg > ed > > I imagine the set of sha hashes at PR boundaries could be extracted > from this query and bisect could select from that set of sha hashes > that are a subset of the total set of sha hashes. Biset would > identify the specific PR merge where the bug or behavior change was > introduced. > > Given the current state of the edk2 repo history, the concern about > splitting up commits on package boundaries that may cause a bisect > failure at a commit boundary within a single patch series may not be > something that needs to be considered. > > A refinement to the proposal is to require patches to not cross > package boundaries and authors simply need to split into multiple > commits based on package boundaries without considering bisect within > a single patch series. That prevents adding extra patches that make > the changes hard to understand and has the additional benefit of > making the patch series easier to review and clear ownership for > reviewing each patch. > > Best regards, > > Mike > > > -----Original Message----- > > From: Ard Biesheuvel <a...@kernel.org> > > Sent: Wednesday, February 14, 2024 3:48 PM > > To: devel@edk2.groups.io; Kinney, Michael D > > <michael.d.kin...@intel.com> > > Cc: Leif Lindholm <quic_llind...@quicinc.com>; Rebecca Cran > > <rebe...@bsdio.com>; Liming Gao <gaolim...@byosoft.com.cn>; Feng, Bob > C > > <bob.c.f...@intel.com>; Chen, Christine <yuwei.c...@intel.com>; > Michael > > Kubacki <mikub...@linux.microsoft.com> > > Subject: Re: [edk2-devel] [Patch 1/1] BaseTools/Scripts/PatchCheck: > > Error if commit modifies multiple packages > > > > On Thu, 15 Feb 2024 at 00:27, Michael D Kinney > > <michael.d.kin...@intel.com> wrote: > > > > > > Hi Ard, > > > > > > Using commit message does provide granularity down to a single > > commit, > > > which may be better than PR label which applies across all commits > > > in a series. > > > > > > However, a flag in the commit message can be set by the author who > > may > > > not be a maintainer and maintainers would then need to review > commit > > > messages for incorrect use of those flags. > > > > > > Only maintainers/admins can set labels, so from a permission > > management > > > perspective the PR label has an advantage. > > > > > > Additional comments below. There are ways to support bisect and > meet > > > the proposed checks. The suggestion uses techniques that Laszlo > > helped > > > me with in the past when working on issues like there. I have seen > > more > > > complex scenarios than the examples listed below, and have been > able > > to > > > figure out a path through. > > > > > > I would agree it is extra work to think about these when working on > > > the code changes and extra work to reformulate the patches when > these > > > conditions are encountered. > > > > > > I just want to be clear on the objections. It is not about if the > > patches > > > can be organized to follow this proposal. The objection is about > the > > > extra work required to reformulate patches. > > > > > > > No. > > > > The objection is fundamentally about having to appease the CI even if > > doing so is unreasonable. I don't mind a bit of extra work. I do mind > > having to make code changes that make the code worse just to tick a > CI > > box. (This is why I disabled uncrustify for the ARM packages: many > > header files which were perfectly legible before were converted into > a > > jumble of alphabet soup because uncrustify removed all of the > > indentation) > > > > It is about having the discretion to deviate from the rules in the > odd > > case where the cure is worse than the disease. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#115566): https://edk2.groups.io/g/devel/message/115566 Mute This Topic: https://groups.io/mt/104345509/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-