New mklog script

2020-05-15 Thread Martin Liška

Hi.

Since we moved to the git world and are preparing for ChangeLog entries to
live in git commit messages, I think it's the right time to also simplify
the mklog script.

I'm sending a new version (which should eventually replace contrib/mklog and
contrib/mklog.pl).
Changes made in this version:

- the script uses unidiff - it greatly simplifies the parsing of the '+-!'
  lines that is done in contrib/mklog
- no author or date stamp is used - all of that can be obtained from git
- the --inline option is not supported - I don't see a use case for it now
- the new script has unit tests (just a few of them for now)

I compared the results with the old Python script for the last 80 commits and
they are very close; in some cases the new script does even better.

I'm planning to maintain and improve the script in the future.

Thoughts?
Martin
From 9fa5d13856f0f5ba153801baf57d4a732829f609 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 15 May 2020 00:44:07 +0200
Subject: [PATCH] Add mklog-ng.py and gcc-mklog git alias.

contrib/ChangeLog:

	* gcc-git-customization.sh: Add gcc-mklog alias.
	* mklog_ng.py: New file.
	* test_mklog_ng.py: New file.
---
 contrib/gcc-git-customization.sh |   2 +
 contrib/mklog_ng.py  | 192 +++
 contrib/test_mklog_ng.py | 158 +
 3 files changed, 352 insertions(+)
 create mode 100755 contrib/mklog_ng.py
 create mode 100755 contrib/test_mklog_ng.py

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index a932bf8c06a..b7b97327be3 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -25,6 +25,8 @@ git config alias.svn-rev '!f() { rev=$1; shift; git log --all --grep="^From-SVN:
 git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse \${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p'); fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); rr=\$(echo \$r | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c \${o:-origin}/\$m && \echo \${r}; fi; }; f"
 git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream); r=\$(echo \$1 | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p'); git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
 
+git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog_ng.py" $@; } ; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
diff --git a/contrib/mklog_ng.py b/contrib/mklog_ng.py
new file mode 100755
index 000..a67fc007759
--- /dev/null
+++ b/contrib/mklog_ng.py
@@ -0,0 +1,192 @@
+#!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+
+# This script parses a .diff file generated with 'diff -up' or 'diff -cp'
+# and adds a skeleton ChangeLog file to the file. It does not try to be
+# too smart when parsing function names, but it produces a reasonable
+# approximation.
+#
+# Author: Martin Liska 
+
+import argparse
+import os
+import re
+import sys
+
+from unidiff import PatchSet
+
+pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
+identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
+comment_re

Re: ChangeLog files - server and client scripts

2020-05-15 Thread Martin Liška

On 5/14/20 6:47 PM, Joseph Myers wrote:
> On Thu, 14 May 2020, Martin Liška wrote:
>
> > On 5/13/20 7:53 PM, Joseph Myers wrote:
> > > On Wed, 13 May 2020, Martin Liška wrote:
> > >
> > > > I'm sending the gcc-changelog related scripts which should be added to
> > > > the contrib folder. The patch contains:
> > > > - git_check_commit.py - checking script that verifies git message format
> > >
> > > We need a documentation patch to contribute.html or gitwrite.html that
> > > describes the exact commit message format being used.
> >
> > Sure, I'm sending patch for that.
>
> Thanks.  There are references to author timestamps there.  The date in a
> ChangeLog entry should always be a commit timestamp, not an author one, so
> author timestamps present either in commit messages or in the git commit
> metadata should be ignored, with only the committer timestamps from the
> git commit metadata being used when generating ChangeLog files.


You are absolutely right, the committer date is what should be used.
Fixed in the documentation; note that the scripts use the committer date.

Martin
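
To illustrate the distinction drawn above, here is a minimal sketch (not taken from the scripts in the patch) of building a ChangeLog header line from a committer timestamp. It assumes the timestamp is the UNIX epoch value printed by `git log -1 --format=%ct` (`%at` would be the author timestamp, which is the one to ignore). The function name, the sample identity and the choice of UTC are illustrative assumptions.

```python
from datetime import datetime, timezone


def changelog_header(committer_epoch, name, email):
    """Return 'YYYY-MM-DD  Name  <email>' for a committer timestamp.

    committer_epoch is assumed to come from `git log -1 --format=%ct`;
    the author timestamp (`%at`) is deliberately not used.
    """
    # UTC is an assumption made here so the result is reproducible.
    day = datetime.fromtimestamp(committer_epoch, tz=timezone.utc).date()
    # Two spaces separate the date, the name and the email.
    return f"{day.isoformat()}  {name}  <{email}>"


# Hypothetical committer; 1587600000 is midnight UTC on 2020-04-23.
print(changelog_header(1587600000, "Jane Doe", "jane@example.com"))
```

Rebasing or cherry-picking a commit keeps its author date but refreshes the committer date, which is why the two can differ.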

diff --git a/htdocs/codingconventions.html b/htdocs/codingconventions.html
index f4732ef6..d2e73962 100644
--- a/htdocs/codingconventions.html
+++ b/htdocs/codingconventions.html
@@ -112,9 +112,14 @@ maintained and kept up to date.  In particular:
 
 ChangeLogs
 
-GCC requires ChangeLog entries for documentation changes; for the web
-pages (apart from java/ and libstdc++/) the CVS
-commit logs are sufficient.
+
+ChangeLog entries are part of git commit messages and are automatically put
+into a corresponding ChangeLog file.  A ChangeLog template can be easily generated
+with the ./contrib/mklog script.  GCC offers a checking script that
+verifies proper ChangeLog formatting for a particular git commit
+(see the git gcc-verify alias).  The checking script covers the most commonly used
+ChangeLog formats and the following paragraphs explain what it supports.
+
 
 See also what the http://www.gnu.org/prep/standards_toc.html GNU Coding
@@ -124,19 +129,95 @@ in comments rather than the ChangeLog, though a single line overall
 description of the changes may be useful above the ChangeLog entry for
 a large batch of changes.
 
-For changes that are ported from another branch, we recommend to
-use a single entry whose body contains a verbatim copy of the original
-entries describing the changes on that branch, possibly preceded by a
-single-line overall description of the changes.
+Components
+
+
+git_description - a leading text with git commit description
+committer_timestamp - line with timestamp and an author name and email (2 spaces before and after name) 
+example: 2020-04-23␣␣Martin Liska␣␣
+additional_author - line with additional commit author name and email (starting with a tab and 4 spaces)
+example: \tMartin Liska␣␣
+changelog_location - a location to a ChangeLog file 
+supported formats: a/b/c/ChangeLog, a/b/c/ChangeLog:, a/b/c/ (where ChangeLog file lives in the folder), \ta/b/c/ and a/b/c
+pr_entry - bug report reference 
+example: \tPR component/12345
+changelog_file - a modified file mentioned in a ChangeLog:
+supported formats: \t* a/b/c/file.c:, \t* a/b/c/file.c (function):, \t* a/b/c/file1.c, a/b/c/file2.c:
+changelog_file_comment - line that follows a changelog_file with description of changes in the file;
+must start with \t
+co_authored_by - GitHub format for a Co-Authored-By (https://help.github.com/en/github/committing-changes-to-your-project/creating-a-commit-with-multiple-authors)
+
+
+Format rules
+
+
+git_description - optional; ends right before one of the other components is found
+committer_timestamp - optional; when found before a changelog_file, then it is added
+to each changelog entry
+additional_author - optional
+changelog_location - optional; parser attempts to identify ChangeLog file based
+on modified files; $changelog_location belonging to a different ChangeLog must
+be separated with an empty line
+pr_entry - optional; can contain any number of PR entries
+changelog_file - each changelog_location must contain at least one file
+changelog_file_comment - optional
+co_authored_by - optional, can contain more than one
+
+
+Documented behaviour
+
+
+a missing changelog_location file location can be deduced based on group of changelog_files
+script automatically generates missing "New file." entries for files that are added in a commit
+changed files that are not mentioned in a ChangeLog file generate an error
+similarly for unchanged files that are mentioned in a ChangeLog file
+a commit author and committer date stamp can be automatically deduced from a git commit - we recommend to use it
+co_authored_by is added to each ChangeLog entry
+a PR component is checked against list of valid components
+ChangeLog files, DATESTAMP, BASE-VER and DEV-PHASE can be modified only separately from other file
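
The consistency rule in the list above (changed files must be mentioned, and mentioned files must be changed) can be sketched roughly as follows. This is not the code of git_check_commit.py: the regular expression, the function names and the sample paths are assumptions made for illustration, and the sketch ignores the automatic "New file." handling.

```python
import re

# Hypothetical pattern for a changelog_file line: "\t* path:",
# "\t* path (function):" or "\t* path1, path2:".
FILE_LINE = re.compile(r'^\t\* (?P<files>[^:(]+?)(?: \([^)]*\))?:')


def mentioned_files(message, changelog_dir):
    """Collect the files named in changelog_file lines of a message."""
    found = set()
    for line in message.splitlines():
        match = FILE_LINE.match(line)
        if match:
            for name in match.group('files').split(','):
                found.add(changelog_dir + name.strip())
    return found


def check(message, changelog_dir, changed):
    """Return (changed but not mentioned, mentioned but not changed)."""
    mentioned = mentioned_files(message, changelog_dir)
    changed = set(changed)
    return sorted(changed - mentioned), sorted(mentioned - changed)


# Made-up commit message and file list.
message = (
    "gcc/ChangeLog:\n"
    "\n"
    "\t* foo.c (bar): Handle new case.\n"
    "\t* baz.c, qux.c: Likewise.\n"
)
missing, stale = check(message, "gcc/",
                       ["gcc/foo.c", "gcc/baz.c", "gcc/extra.c"])
print("not mentioned:", missing)
print("not changed:", stale)
```

Either non-empty list would correspond to one of the two error cases described in the documentation.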

Re: New mklog script

2020-05-15 Thread David Malcolm via Gcc
On Fri, 2020-05-15 at 10:59 +0200, Martin Liška wrote:
> Hi.
> 
> Since we moved to git world and we're in the preparation for
> ChangeLog messages
> being in git commit messages, I think it's the right time to also
> simplify mklog
> script.
> 
> I'm sending a new version (which should eventually replace
> contrib/mklog and contrib/mklog.pl).
> Changes made in the version:
> 
> - the script uses unifdiff - it rapidly simplifies parsing of the '+-
> !' lines that is done
>in contrib/mklog
> - no author nor date stamp is used - that all can be get from git
> - --inline option is not supported - I don't see a use-case for it
> now
> - the new script has a unit tests (just few of them for now)
> 
> I compares results in between the old Python script for last 80
> commits and it's very close,
> in some cases it does even better.
> 
> I'm planning to maintain and improve the script for the future.
> 
> Thoughts?
> Martin

> +class TestMklog(unittest.TestCase):
> +    def test_macro_definition(self):
> +        changelog = generate_changelog(PATCH1)
> +        assert changelog == EXPECTED1
> +
> +    def test_changed_argument(self):
> +        changelog = generate_changelog(PATCH2)
> +        assert changelog == EXPECTED2
> +
> +    def test_enum_and_struct(self):
> +        changelog = generate_changelog(PATCH3)
> +        assert changelog == EXPECTED3
> +
> +    def test_no_function(self):
> +        changelog = generate_changelog(PATCH3, True)
> +        assert changelog == EXPECTED3B

Use self.assertEqual(a, b) rather than assert a == b, so that if it
fails you get a multiline diff:

e.g.:

import unittest

class TestMklog(unittest.TestCase):
    def test_macro_definition(self):
        self.assertEqual('''
first
second
third''', '''
first
SECOND
third''')

unittest.main()


has this output:

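F
======================================================================
FAIL: test_macro_definition (__main__.TestMklog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/foo.py", line 11, in test_macro_definition
    third''')
AssertionError: '\nfirst\nsecond\nthird' != '\nfirst\nSECOND\nthird'
  
  first
- second
+ SECOND
  third

----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (failures=1)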

which is much easier to debug than the output from assert a == b, which
is just:

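F
======================================================================
FAIL: test_macro_definition (__main__.TestMklog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/foo.py", line 11, in test_macro_definition
    third''')
AssertionError

----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (failures=1)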



Re: New mklog script

2020-05-15 Thread Martin Liška

On 5/15/20 12:58 PM, David Malcolm wrote:
> On Fri, 2020-05-15 at 10:59 +0200, Martin Liška wrote:
> > Hi.
> >
> > Since we moved to git world and we're in the preparation for
> > ChangeLog messages
> > being in git commit messages, I think it's the right time to also
> > simplify mklog
> > script.
> >
> > I'm sending a new version (which should eventually replace
> > contrib/mklog and contrib/mklog.pl).
> > Changes made in the version:
> >
> > - the script uses unifdiff - it rapidly simplifies parsing of the '+-
> > !' lines that is done
> > in contrib/mklog
> > - no author nor date stamp is used - that all can be get from git
> > - --inline option is not supported - I don't see a use-case for it
> > now
> > - the new script has a unit tests (just few of them for now)
> >
> > I compares results in between the old Python script for last 80
> > commits and it's very close,
> > in some cases it does even better.
> >
> > I'm planning to maintain and improve the script for the future.
> >
> > Thoughts?
> > Martin
>
> > +class TestMklog(unittest.TestCase):
> > +    def test_macro_definition(self):
> > +        changelog = generate_changelog(PATCH1)
> > +        assert changelog == EXPECTED1
> > +
> > +    def test_changed_argument(self):
> > +        changelog = generate_changelog(PATCH2)
> > +        assert changelog == EXPECTED2
> > +
> > +    def test_enum_and_struct(self):
> > +        changelog = generate_changelog(PATCH3)
> > +        assert changelog == EXPECTED3
> > +
> > +    def test_no_function(self):
> > +        changelog = generate_changelog(PATCH3, True)
> > +        assert changelog == EXPECTED3B


Thank you, David, for the review.

However, I see the same output for both operator== and assertEqual, probably
because I'm using pytest version 4?

assertEqual:

$ pytest contrib/test_mklog_ng.py
Test session starts (platform: linux, Python 3.8.2, pytest 4.6.9, pytest-sugar 
0.9.3)
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False 
min_rounds=5 min_time=0.05 max_time=1.0 calibration_precision=10 
warmup=False warmup_iterations=10)
rootdir: /home/marxin/Programming/gcc
plugins: xdist-1.32.0, sugar-0.9.3, forked-1.1.3, benchmark-3.2.3, 
aspectlib-1.5.0, cov-2.8.1, flake8-1.0.5
collecting ...
 contrib/test_mklog_ng.py ✓ 


  25% ██▌


 TestMklog.test_enum_and_struct 


self = 

def test_enum_and_struct(self):
changelog = generate_changelog(PATCH3)

  self.assertEqual(changelog, EXPECTED3)

E   AssertionError: 'libc[23 chars]clude/cpplib.h (enum c_lang):\n\t(struct 
cpp_options):\n\n' != 'libc[23 chars]clude/cppli22b.h (enum c_lang):\n\t(struct 
cpp_optio44ns):\n\n'
E libcpp/ChangeLog:
E
E   -   * include/cpplib.h (enum c_lang):
E   +   * include/cppli22b.h (enum c_lang):
E   ?  ++
E   -   (struct cpp_options):
E   +   (struct cpp_optio44ns):
E   ?++

contrib/test_mklog_ng.py:154: AssertionError

operator==:

$ pytest contrib/test_mklog_ng.py
Test session starts (platform: linux, Python 3.8.2, pytest 4.6.9, pytest-sugar 
0.9.3)
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False 
min_rounds=5 min_time=0.05 max_time=1.0 calibration_precision=10 
warmup=False warmup_iterations=10)
rootdir: /home/marxin/Programming/gcc
plugins: xdist-1.32.0, sugar-0.9.3, forked-1.1.3, benchmark-3.2.3, 
aspectlib-1.5.0, cov-2.8.1, flake8-1.0.5
collecting ...
 contrib/test_mklog_ng.py ✓ 


  25% ██▌


 TestMklog.test_enum_and_struct 


self = 

def test_enum_and_struct(self):
changelog = generate_changelog(PATCH3)

  assert changelog == EXPECTED3

E   AssertionError: assert 'libcpp/Chang...options):\n\n' == 
'libcpp/Change...tio44ns):\n\n'
E   libcpp/ChangeLog:
E
E - * include/cpplib.h (enum c_lang):
E + * include/cppli22b.h (enum c_lang):
E ?++
E - (struct cpp_options):
E + (struct cpp_optio44ns):...
E
E ...Full output truncated (3 lines hidden), use '-vv' to show

Martin



Use self.assertEqual(a, b) rather than assert a == b, so that if it
fails you get a multiline diff:

e.g.:

import unittest

class 

Re: New mklog script

2020-05-15 Thread Marek Polacek via Gcc
On Fri, May 15, 2020 at 10:59:56AM +0200, Martin Liška wrote:
> Hi.
> 
> Since we moved to git world and we're in the preparation for ChangeLog 
> messages
> being in git commit messages, I think it's the right time to also simplify 
> mklog
> script.
> 
> I'm sending a new version (which should eventually replace contrib/mklog and 
> contrib/mklog.pl).
> Changes made in the version:
> 
> - the script uses unifdiff - it rapidly simplifies parsing of the '+-!' lines 
> that is done
>   in contrib/mklog

Nice!

> - no author nor date stamp is used - that all can be get from git

This is good.

> - --inline option is not supported - I don't see a use-case for it now

I actually use mklog -i all the time.  But I can work around it if it
disappears.

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



Re: New mklog script

2020-05-15 Thread David Malcolm via Gcc
On Fri, 2020-05-15 at 13:20 +0200, Martin Liška wrote:
> On 5/15/20 12:58 PM, David Malcolm wrote:
> > On Fri, 2020-05-15 at 10:59 +0200, Martin Liška wrote:
> > > Hi.
> > > 
> > > Since we moved to git world and we're in the preparation for
> > > ChangeLog messages
> > > being in git commit messages, I think it's the right time to also
> > > simplify mklog
> > > script.
> > > 
> > > I'm sending a new version (which should eventually replace
> > > contrib/mklog and contrib/mklog.pl).
> > > Changes made in the version:
> > > 
> > > - the script uses unifdiff - it rapidly simplifies parsing of the
> > > '+-
> > > !' lines that is done
> > > in contrib/mklog
> > > - no author nor date stamp is used - that all can be get from git
> > > - --inline option is not supported - I don't see a use-case for
> > > it
> > > now
> > > - the new script has a unit tests (just few of them for now)
> > > 
> > > I compares results in between the old Python script for last 80
> > > commits and it's very close,
> > > in some cases it does even better.
> > > 
> > > I'm planning to maintain and improve the script for the future.
> > > 
> > > Thoughts?
> > > Martin
> > > +class TestMklog(unittest.TestCase):
> > > +    def test_macro_definition(self):
> > > +        changelog = generate_changelog(PATCH1)
> > > +        assert changelog == EXPECTED1
> > > +
> > > +    def test_changed_argument(self):
> > > +        changelog = generate_changelog(PATCH2)
> > > +        assert changelog == EXPECTED2
> > > +
> > > +    def test_enum_and_struct(self):
> > > +        changelog = generate_changelog(PATCH3)
> > > +        assert changelog == EXPECTED3
> > > +
> > > +    def test_no_function(self):
> > > +        changelog = generate_changelog(PATCH3, True)
> > > +        assert changelog == EXPECTED3B
> 
> Thank you David for review.
> 
> However I see the same output for both operator== and assertEqual.
> Probably
> because of usage of pytest version 4?

Ah, yes.  pytest does "magical" things with frame inspection IIRC to
scrape the locals out of the failing python stack frame.

Dave



Re: New mklog script

2020-05-15 Thread Martin Liška

On 5/15/20 2:42 PM, Marek Polacek wrote:
> I actually use mklog -i all the time.  But I can work around it if it
> disappears.


Ah, so there is a user of it.
Attached is an updated version that supports the option.

For the future, will you still use the option? Wouldn't it be better
to put the ChangeLog content directly into the commit message? Note
that you won't have to copy the entries to a particular ChangeLog file.

Martin
From d7d5e3aa7450449a8b0cb30d6bf485538990ea3f Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 15 May 2020 00:44:07 +0200
Subject: [PATCH] Add mklog-ng.py and gcc-mklog git alias.

contrib/ChangeLog:

	* gcc-git-customization.sh: Add gcc-mklog alias.
	* mklog_ng.py: New file.
	* test_mklog_ng.py: New file.
---
 contrib/gcc-git-customization.sh |   2 +
 contrib/mklog_ng.py  | 209 +++
 contrib/test_mklog_ng.py | 158 +++
 3 files changed, 369 insertions(+)
 create mode 100755 contrib/mklog_ng.py
 create mode 100755 contrib/test_mklog_ng.py

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index a932bf8c06a..b7b97327be3 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -25,6 +25,8 @@ git config alias.svn-rev '!f() { rev=$1; shift; git log --all --grep="^From-SVN:
 git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse \${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p'); fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); rr=\$(echo \$r | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c \${o:-origin}/\$m && \echo \${r}; fi; }; f"
 git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream); r=\$(echo \$1 | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p'); git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
 
+git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog_ng.py" $@; } ; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
diff --git a/contrib/mklog_ng.py b/contrib/mklog_ng.py
new file mode 100755
index 000..8dca6dbeef0
--- /dev/null
+++ b/contrib/mklog_ng.py
@@ -0,0 +1,209 @@
+#!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+
+# This script parses a .diff file generated with 'diff -up' or 'diff -cp'
+# and adds a skeleton ChangeLog file to the file. It does not try to be
+# too smart when parsing function names, but it produces a reasonable
+# approximation.
+#
+# Author: Martin Liska 
+
+import argparse
+import os
+import re
+import sys
+import tempfile
+
+from unidiff import PatchSet
+
+pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
+identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
+comment_regex = re.compile(r'^\/\*')
+struct_regex = re.compile(r'^((class|struct|union|enum)\s+[a-zA-Z0-9_]+)')
+macro_regex = re.compile(r'#\s*(define|undef)\s+([a-zA-Z0-9_]+)')
+super_macro_regex = re.compile(r'^DEF[A-Z0-9_]+\s*\(([a-zA-Z0-9_]+)')
+fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
+template_and_param_regex = re.compile(r'<[^<>]*>')
+
+function_extensions = set(['.c', '.cpp', '.C', '.
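
As a rough illustration of what the regular expressions in the patch above are meant to pick up, the following sketch runs a few of them on made-up input lines. The patterns are copied from the patch, except that the named group `pr` in `pr_regex` is an assumption, since the group name does not survive in the archived text. The sample lines are hypothetical and are not taken from the script's tests.

```python
import re

# Patterns as in the patch; the group name 'pr' is an assumption.
pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
struct_regex = re.compile(r'^((class|struct|union|enum)\s+[a-zA-Z0-9_]+)')
macro_regex = re.compile(r'#\s*(define|undef)\s+([a-zA-Z0-9_]+)')
super_macro_regex = re.compile(r'^DEF[A-Z0-9_]+\s*\(([a-zA-Z0-9_]+)')
fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')

# A bug reference inside a C comment.
pr = pr_regex.search('/* PR c++/12345 */').group('pr')
# An aggregate declaration at the start of a line.
struct = struct_regex.match('struct cpp_options').group(1)
# The name of a defined macro.
macro = macro_regex.search('#define FOO 1').group(2)
# The first argument of a DEF* "super macro".
code = super_macro_regex.match('DEFTREECODE (PLUS_EXPR, "plus_expr")').group(1)
# A function name followed by its parameter list.
function = fn_regex.search('static int foo (int x)').group(1)

print(pr, struct, macro, code, function, sep='\n')
```

How the script combines these matches into ChangeLog entries is not visible in the truncated patch, so the sketch stops at the individual patterns.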

Re: New mklog script

2020-05-15 Thread Marek Polacek via Gcc
On Fri, May 15, 2020 at 03:12:27PM +0200, Martin Liška wrote:
> On 5/15/20 2:42 PM, Marek Polacek wrote:
> > I actually use mklog -i all the time.  But I can work around it if it
> > disappears.
> 
> Ah, I can see a consumer.
> There's an updated version that supports that.
> 
> For the future, will you still use the option? Wouldn't be better
> to put the ChangeLog content directly to commit message? Note
> that you won't have to copy the entries to a particular ChangeLog file.

The way I do it is to generate a patch using format-patch, use mklog -i
on it, then add the ChangeLog entry to the commit message via commit --amend.

Anything that has to do with ChangeLogs is pointless make-work, so the less
I have to do, the better.  ;-)

Marek



Re: New mklog script

2020-05-15 Thread Martin Sebor via Gcc

On 5/15/20 2:59 AM, Martin Liška wrote:
> Hi.
>
> Since we moved to git world and we're in the preparation for ChangeLog
> messages being in git commit messages, I think it's the right time to also
> simplify mklog script.
>
> I'm sending a new version (which should eventually replace contrib/mklog
> and contrib/mklog.pl).
> Changes made in the version:
>
> - the script uses unifdiff - it rapidly simplifies parsing of the '+-!'
>   lines that is done in contrib/mklog
> - no author nor date stamp is used - that all can be get from git
> - --inline option is not supported - I don't see a use-case for it now
> - the new script has a unit tests (just few of them for now)
>
> I compares results in between the old Python script for last 80 commits
> and it's very close, in some cases it does even better.
>
> I'm planning to maintain and improve the script for the future.
>
> Thoughts?


It's pretty nice.  I have a script of my own that does the same thing
in a slightly different way.  Here's an example of its output:
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200323/5437db5a/attachment-0001.bin

I find this format more helpful for the reasons below so unless your
script can be tweaked to do something similar I'd like to be able to
continue to use mine going forward with the new infrastructure.

As for my comments on mklog_ng.py: In the one test I did the script
produced a single long ChangeLog entry with all the files in the diff
I gave it, tests and all, in alphabetical order.  The script fills in
"New test." for new tests.  The rest has to be edited as one would
expect.

I would find the output easier to work with if it a) grouped files by
"subsystem" corresponding to each ChangeLog directory (and if it also
identified each subsystem), b) put the testsuite section last, and
(as a bonus) c) grouped all new files in each section together.

First, I find this logical grouping helpful in thinking about how
the changes are structured (e.g., would it make sense to restructure
them or break things up to reduce coupling and make review easier),
and whom they need to be reviewed by.

Second, this is the grouping I'm already used to from my own script
(so YMMV here of course).
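
The grouping asked for in (a) and (b) could be sketched along these lines. This is neither mklog_ng.py nor the script mentioned in this message: the function name, the rule of picking the deepest directory that holds a ChangeLog, and the sample paths are assumptions for illustration, and point (c) is not covered.

```python
import posixpath
from collections import defaultdict


def group_by_changelog(paths, changelog_dirs):
    """Group paths under the deepest ChangeLog directory containing them."""
    groups = defaultdict(list)
    for path in paths:
        directory = posixpath.dirname(path)
        # Walk upwards until a directory that has a ChangeLog is found;
        # an empty string stands for the top-level ChangeLog.
        while directory and directory not in changelog_dirs:
            directory = posixpath.dirname(directory)
        groups[directory].append(path)
    # Alphabetical order, except that testsuite sections go last.
    order = sorted(groups, key=lambda d: (d.endswith('testsuite'), d))
    return [(d, sorted(groups[d])) for d in order]


# Made-up input: changed files and the directories that hold a ChangeLog.
sections = group_by_changelog(
    ['gcc/testsuite/gcc.dg/pr1.c', 'gcc/cp/parser.c',
     'gcc/tree.c', 'libcpp/include/cpplib.h'],
    {'gcc', 'gcc/cp', 'gcc/testsuite', 'libcpp'})
for directory, files in sections:
    print(directory + '/ChangeLog:', files)
```

Sorting on the pair (is testsuite, name) keeps the ordering stable and predictable while still moving the testsuite section to the end.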

Finally, my script also looks up bugs in Bugzilla and adds a line with
each bug number and its Summary at the top of the patch.  This helps me
double-check the spelling of the bug id(s) in case I transpose digits
etc.

Martin

PS My script modifies the patch file in place: it adds the ChangeLog
section if it doesn't exist yet, but it doesn't do anything if it does.
I'd love for it to check the existing ChangeLog if it exists and
update it when it finds differences between it and the latest patch
that aren't reflected there.

Without this, each time a patch changes I have to review the entry
and update it as necessary.  That makes it too easy to miss things.


Re: New mklog script

2020-05-15 Thread Martin Liška

On 5/15/20 3:22 PM, Marek Polacek wrote:
> On Fri, May 15, 2020 at 03:12:27PM +0200, Martin Liška wrote:
> > On 5/15/20 2:42 PM, Marek Polacek wrote:
> > > I actually use mklog -i all the time.  But I can work around it if it
> > > disappears.
> >
> > Ah, I can see a consumer.
> > There's an updated version that supports that.
> >
> > For the future, will you still use the option? Wouldn't be better
> > to put the ChangeLog content directly to commit message? Note
> > that you won't have to copy the entries to a particular ChangeLog file.
>
> The way I do it is to generate a patch using format-patch, use mklog -i
> on it, then add the ChangeLog entry to the commit message via commit --amend.


Hmm, you can do much better with:

$ git diff | ./contrib/mklog > changelog && git commit -a -t changelog

Or for an already created commit you can do:

$ git diff HEAD~ | ./contrib/mklog > changelog && git commit -a --amend -e -F changelog

That said, I believe usage of -i is legacy.

Martin



> Anything that has to do with ChangeLogs is pointless make-work, so the less
> I have to do, the better.  ;-)
>
> Marek





gcc-9-20200515 is now available

2020-05-15 Thread GCC Administrator via Gcc
Snapshot gcc-9-20200515 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/9-20200515/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 9 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-9 
revision aa237c6dceeeab1455d83a9063ab87afb4a18082

You'll find:

 gcc-9-20200515.tar.xz    Complete GCC

  SHA256=05de09d64c91d2474b833928a956e11af5b49d92af01a86a9ad4dd723294fabc
  SHA1=904878c2815621a6a5436779bb1afae3bdb7ea5d

Diffs from 9-20200508 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: dejagnu version update?

2020-05-15 Thread Mike Stump via Gcc
On May 14, 2020, at 11:11 AM, Tom Tromey  wrote:
> 
>> "Rob" == Rob Savoye  writes:
> 
> Rob>   Not that team, the folks I talked to thought I was crazy for wanting
> Rob> to refactor it. :-)
> 
> I don't think refactoring dejagnu is crazy, but I think it's pretty hard
> to imagine rewriting the gdb test suite in Python.  It's 260 KLOC.

So, TCL is subject to being easy to parse, and if you can reliably move each 
feature to a new system with a re-engineering style system that is complete 
enough to handle converting code from TCL to python, for example; one merely 
needs to complete the work for a few of the odd corner cases one might use.  At 
some point, I do think as an industry, we do need tools to migrate code from 
system to system, updating the language used.  C++ may well fall outside of the 
possibility for the next 30-90 years, but TCL, lisp and python might not be so 
unreasonable in a shorter timeframe.  I once saw someone convert TCL into lisp I 
think it was, which I thought was neat.

One day, would be nice if language implementors and designers implemented 
conversions into and out of their language from _the_ re-engineering toolkit as 
they did their language.  10 or 30 years after they decide, oh, no more support 
for you, you're dead, you can then migrate to the next new wiz bang language.

Yes, I say this all, even knowing that people can't even do the python 2.7 -> 
3.x conversion program yet.

Anyway, love to have software that can move code wholesale.  Love to move the 
testsuite into a new language.

Re: dejagnu version update?

2020-05-15 Thread Rob Savoye
On 5/15/20 6:22 PM, Mike Stump wrote:

> Anyway, love to have software that can move code wholesale.  Love to move the 
> testsuite into a new language.

  All it needs is funding. :-) What GDB needs is expect, not Tcl. Most
of the GDB testsuite is just expect pattern matching from the shell.
That's the entire reason I chose Tcl as it already had expect support.
Expect was necessary functionality for GDB testing. For GCC &  Binutils,
Expect is only used for remote testing support.

  As it's possible to embed Tcl in other programs, the idea was to use
an embedded Tcl interpreter when needed during a transition period. It's
mostly just the framework itself that would need to be refactored into
Python. There is also a large amount of code in gcc/testsuite that
should probably be in core DejaGnu too. That would be a large component
in analyzing existing code to write a true design doc. The best part is
now we have large toolchain testsuites to use to test DejaGnu changes.

  At one point we thought DejaGnu would be used for other projects, but I
think it's obvious that these days it's only used for GNU toolchain
testing.

  I'm making progress on setting up a development environment to test
patches. I use my ABE tool to build toolchains, had to fix some bugs
(and add PI support) first.

- rob -
---
https://www.senecass.com