Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog

Martin Sebor via Gcc Wed, 16 Jun 2021 17:17:24 -0700

On 6/16/21 5:45 PM, Jason Merrill wrote:

On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <mse...@gmail.com<mailto:mse...@gmail.com>> wrote:


    On 6/16/21 2:49 PM, Jason Merrill wrote:
     > On 6/15/21 11:42 PM, Jason Merrill wrote:
     >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
    <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
     >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
     >>
     >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
     >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
     >>      >
     >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
     >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
     >>      >>>> My objection is to making our policies and tools more
     >> restrictive
     >>      >>>> than they need to be.  We shouldn't expect everyone to
    study
     >> whole
     >>      >>>> manuals just to figure out how to successfully commit a
     >> change (or
     >>      >>>> learn how to format it just the right way).  It should
    be easy.
     >>      >>>
     >>      >>> I agree, to some extent. But consistency is also good. The
     >>     conventions
     >>      >>> for GNU ChangeLog formatting exist for a reason, and so
    do the
     >>      >>> conventions for good Git commit messages.
     >>      >>>
     >>      >>>> Setting this discussion aside for a moment and using a
     >> different
     >>      >>>> example, the commit hook rejects commit messages that
    don't
     >> start
     >>      >>>> ChangeLog entries with tabs.  It also rejects commit
     >> messages that
     >>      >>>> don't list all the same test files as those changed by
    the
     >> commit
     >>      >>>> (and probably some others as well).  That's in my view
     >> unnecessary
     >>      >>>> when the hook could just replace the leading spaces with
     >> tabs and
     >>      >>>> automatically mention all the tests.
     >>      >>>>
     >>      >>>> I see this proposal as heading in the same direction.
     >> Rather than
     >>      >>>> making the script fix things up if we get them wrong
    it would
     >>     reject
     >>      >>>> the commit, requiring the user to massage the ChangeLog by
     >>     hand into
     >>      >>>> an unnecessarily rigid format.
     >>      >>>
     >>      >>> You cannot "fix things up" in a server-side receive hook,
     >> because
     >>      >>> changing the commit message would alter the commit
    hash, which
     >>     would
     >>      >>> require the committer to do a rebase to proceed. That
    breaks the
     >>      >>> expected behaviour and workflow of a git repo.
     >>      >>>
     >>      >>> You can use the scripts on the client side to verify
    your commit
     >>      >>> message before pushing, so you don't have to be surprised
     >> when the
     >>      >>> server rejects it.
     >>      >>
     >>      >> That sounds like a killer argument.  Do we have shared
     >> client-side
     >>      >> scripts that could fix things up for us, or are we each
    on our
     >> own
     >>      >> to write them?
     >>      >
     >>      > I hope I got your view wrong.  If not: the "scripts fixing
     >>      > things up for us" direction is flawed (compared to the
    "scripts
     >>      > rejecting bad formats"), unless offered as a non-default
    option;
     >>      > please don't proceed.
     >>      >
     >>      > Why?  For one, there'll always be bugs in the scripting.
     >>      > Mitigate those situations: while wrongly rejecting a
    commit is
     >>      > bad, wrongly "fixing things up" is worse, as a general rule.
     >>      > Better avoid that.  (There's probably a popular "pattern
    name"
     >>      > for what I try to describe.)
     >>
     >>     The word that comes to mind is Technophobia.  Is it wise to
    trust
     >>     compilers to transform programs from their source form into
     >>     executables?  What if there are bugs in either?  What about
    the OS?

>> The whole computer, or the Internet? Our cars?Fortunately, there's

     >>     more to gain than to lose by trusting automation.  If there
    weren't
     >>     human progress would be stuck sometime in the 1700's.
     >>
     >>     But we're not talking about anything anywhere that sophisticated
     >>     here: a sed script to copy and paste a piece of text in
     >>     the description of a change from one place to another.  It's
    been
     >>     done a few times before with more important data than
    ChangeLogs.
     >>
     >>
     >> git gcc-commit-mklog already automates most of the process.  It
    could
     >> also automate adding [PRxxxxx] to the first line.  Is that what
    you're
     >> asking for?
     >
     > Like, say:

    I don't think this solves the problem Xionghu Luo was asking about:
    https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html

Indeed, their problem was not mentioning the PR in the testcase, which ascript isn't going to fix.


    i.e., they did have a [PRnnnn] in the one line subject but not in
    their ChangeLog entries.  It also not clear if they used mklog.py
    at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
    a patch if it finds in one of the tests.  Though it doesn't seem
    to put in the ChangeLog entries.  Odd.


It currently puts in

  PR comp/nnnnn

at the beginning of the ChangeLog entries; it used to put them in theentries for each ChangeLog file, but that changed in r12-771. My patchalso adds the [PRnnnn] to the subject line.


To say I'm not good at Python would be an understatement but I hacked
up the attached patch that:

1) extracts PR numbers from test names,
2) gets the component for each PR from Bugzilla,
3) adds the PR component/nnnnn to each ChangeLog

Running the hacked up script with -p on the patch for PR 100085 prints
the following:

Resolves:

PR 100085/target - Bad code for union transfer from __float128 to vectortypes


gcc/ChangeLog:

        PR 100085/target
        * config/rs6000/rs6000-p8swap.c (pattern_is_rotate64):
        (insn_is_load_p):
        (insn_is_swap_p):
        (quad_aligned_load_p):
        (const_load_sequence_p):
        (replace_swapped_aligned_load):
        (recombine_lvx_pattern):
        (recombine_stvx_pattern):

gcc/testsuite/ChangeLog:

        PR 100085/target
        * gcc.target/powerpc/float128-call.c:
        * gcc.target/powerpc/pr100085.c: New test.


Without -p it doesn't print the Resolves: line or the component name
in the ChangeLog entries.  This also mimics what my retired awk script
does.  What do you think?

    I was suggesting to make this (and the other things the commit
    hook rejects commit for) happen automatically by running a script
    at the same time as git commit.

    But to be clear, I'm not asking for anything for myself.  Although
    I use mklog.py I have my own script that does what I suggest that
    I could go back to.  I responded to this thread because I think
    these minute details could be automated for everyone's benefit.
    Before moving to Git we talked about doing much more, including
    automatically running a format/style checker on the patch and
    (IIRC) Jeff even wanted it to do some basic tweaks on its own
    (like replace spaces with tabs).

We could do various cleanup in the 'commit-msg' hook on the user side.Or, probably better, git gcc-verify could clean up some of the issues itfinds.


I'm not familiar with these.  Where should I look for them to learn
how to do this?

Martin

Add PRs to ChangeLog entries in output.

contrib/ChangeLog:

	* mklog.py (bugzilla_url): Add component to query.
	get_pr_titles): Remove stray debugging output.  Get
	components for PRs.  Return new PR array including components.
	(generate_changelog): Extract PRs from test names.
	Print a Resolves line before PRs.  Replace PRS with updated.
	Print all PRs in each ChangeLog entry.

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 5c93c707128..e6a418a2fb4 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -51,7 +51,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&;' \
-               'include_fields=summary'
+               'include_fields=component,summary'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -115,17 +115,22 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
+    new_prs = []
     output = ''
     for pr in prs:
         pr_id = pr.split('/')[-1]
+        if pr_id == pr:
+          pr_id = pr.split(' ')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            comp = bugs[0]['component']
+            pr_id_comp = "PR " + pr_id + "/" + comp
+            new_prs.append (pr_id_comp)
+            output += 'PR %s/%s - %s\n' % (pr_id, comp, bugs[0]['summary'])
     if output:
         output += '\n'
-    return output
+    return output, new_prs
 
 
 def generate_changelog(data, no_functions=False, fill_pr_titles=False):
@@ -147,6 +152,12 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
 
         # Extract PR entries from newly added tests
         if 'testsuite' in file.path and file.is_added_file:
+            name = os.path.basename(file.path)
+            name = os.path.splitext(name)[0]
+            if name.startswith("pr"):
+                name = name[2:]
+                name = "PR " + name
+                prs.append(name)
             # Only search first ten lines as later lines may
             # contains commented code which a note that it
             # has not been tested due to a certain PR or DR.
@@ -166,22 +177,28 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
                         # Found dg-warning/dg-error line
                         break
 
-    if fill_pr_titles:
-        out += get_pr_titles(prs)
+    # Start with a blank line for a subject and description.
+    out += '\n'
 
-    # print list of PR entries before ChangeLog entries
-    if prs:
-        if not out:
-            out += '\n'
-        for pr in prs:
-            out += '\t%s\n' % pr
-        out += '\n'
+    # Append a Resolves: line with PRs and their titles and replace
+    # PRS with an array of <pr-id>/<component> in case it's not in
+    # that format.
+    if fill_pr_titles:
+        pr_titles, new_prs = get_pr_titles(prs)
+        if new_prs:
+            prs = new_prs
+        if pr_titles:
+            out += "Resolves:\n"
+            out += pr_titles
 
     # sort ChangeLog so that 'testsuite' is at the end
     for changelog in sorted(changelog_list, key=lambda x: 'testsuite' in x):
         files = changelogs[changelog]
         out += '%s:\n' % os.path.join(changelog, 'ChangeLog')
         out += '\n'
+        # Print list of PR entries before each ChangeLog entry.
+        for pr in prs:
+            out += '\t%s\n' % pr
         # new and deleted files should be at the end
         for file in sorted(files, key=sort_changelog_files):
             assert file.path.startswith(changelog)

Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog

Reply via email to