update-copyright request and patch...

Joel Brobecker Mon, 02 Jan 2012 19:39:45 -0800

Hello, and Happy New Year :-).

I am trying to use gnulib's update-copyright script to update all
of GDB's files. It's working quite beautifully, and it's quite fast
too. I only have a couple of issues:


  - C files: In GDB, the style for comments is to avoid the '*'
    at the start of continuation lines. E.g.:

        /* Copyright (C) 2000, 2001, 2002, 2003, 2004,
           2005 Free Software Foundation, Inc.  */

    update-copyright, on the other hand, transforms the "/*" prefix
    into " *"...

  - The script is not always working well with XML files,
    because it potentially repeats the start-of-comment prefix
    when wrapping lines, thus transforming...

        <!-- Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.

    into:

        <!-- Copyright (C) 2000, 2001, 2002, 2003, 2012
        <!-- Free Software Foundation, Inc.

    (or not recognizing multi-line copyright blobs)

I would like to extend the script in gnulib to accommodate GDB in one
way or the other, and I was wondering if I could have a little guidance?
I have a prototype patch, which is attached. My Perl-fu is no longer
what it used to be, I am afraid...

What I did was modify the script to recognize a new environment variable
named MULTILINE_COMMENT_PREFIXES, which should contain a list of
strings, one per line. The script would recognize each of these
prefixes as starting a multi-line comment, meaning that when a
copyright line is wrapped, the continuation line should start with
spaces rather than with that prefix.
Does that make some kind of sense?

For GDB, our script would set it as follows:

    # A list of prefixes that start a multi-line comment.  These prefixes
    # should not be repeated when wrapping long lines.
    MULTILINE_COMMENT_PREFIXES='
    /*
    <!--
    {
    '
    export MULTILINE_COMMENT_PREFIXES

(the '{' is for Pascal).

If MULTILINE_COMMENT_PREFIXES is not defined, the existing behavior is
preserved.
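
To illustrate the intended mapping, here is a minimal shell sketch (not
part of the patch) that walks the variable line by line and prints, for
each prefix, the run of spaces that a wrapped continuation line would
start with. The helper name continuation_for is hypothetical, and the
sketch assumes the prefixes are listed without leading whitespace.

```shell
#!/bin/sh
# Sketch only: mimics how the patch maps each listed prefix to a
# continuation prefix made of spaces of the same width.

MULTILINE_COMMENT_PREFIXES='
/*
<!--
{
'

# Hypothetical helper: print as many spaces as the prefix has characters.
continuation_for() {
  printf '%*s' "${#1}" ''
}

printf '%s\n' "$MULTILINE_COMMENT_PREFIXES" | while IFS= read -r p; do
  # Skip empty or whitespace-only lines, as the patch does.
  case $p in
    *[![:space:]]*) ;;
    *) continue ;;
  esac
  printf '[%s] -> [%s]\n' "$p" "$(continuation_for "$p")"
done
# Prints:
# [/*] -> [  ]
# [<!--] -> [    ]
# [{] -> [ ]
```

Note that a prefix listed here overrides the default, so with '/*' in
the list the continuation becomes two spaces instead of " *".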

Thoughts?

Thank you,
-- 
Joel

PS: One of the suggestions that Jim made was that we could probably
    mitigate the problem by using intervals to group the years
    together.  This would help, and I would like GDB to move to that
    style anyway, but in the past we were updating the copyright year
    lazily, only when actually modifying the file. So not all years
    are going to be collapsed into one single interval, and we really
    need multi-line capabilities to work in our scenario...
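
    As a rough illustration of why intervals alone are not enough, the
    shell sketch below collapses a sorted list of years into intervals.
    The function collapse_years and the sample years are hypothetical;
    this is not how update-copyright itself is implemented. A file that
    was only touched in some years still ends up with several
    comma-separated entries, which can overflow one line.

```shell
#!/bin/sh
# Sketch only: collapse a sorted list of years into intervals.
# collapse_years is a hypothetical helper, not part of update-copyright.

collapse_years() {
  out= start= prev=
  for y in "$@"; do
    if [ -z "$start" ]; then
      start=$y
    elif [ "$y" -ne $((prev + 1)) ]; then
      # Gap found: flush the interval collected so far.
      if [ "$start" -eq "$prev" ]; then r=$start; else r=$start-$prev; fi
      out="${out:+$out, }$r"
      start=$y
    fi
    prev=$y
  done
  # Flush the last interval, if any year was given.
  if [ -n "$start" ]; then
    if [ "$start" -eq "$prev" ]; then r=$start; else r=$start-$prev; fi
    out="${out:+$out, }$r"
  fi
  printf '%s\n' "$out"
}

collapse_years 2000 2001 2002 2005 2007 2008 2012
# Prints: 2000-2002, 2005, 2007-2008, 2012
```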

commit ef3e3b0e670dd6606ccbfd34f9116279d151cac2
Author: Joel Brobecker <brobec...@adacore.com>
Date:   Mon Jan 2 16:58:49 2012 +0400

    Local changes to update-copyright.
    
    Not finalized yet - to be discussed with original authors.

diff --git a/gdb/gnulib/extra/update-copyright b/gdb/gnulib/extra/update-copyright
index d86a12b..f649d78 100755
--- a/gdb/gnulib/extra/update-copyright
+++ b/gdb/gnulib/extra/update-copyright
@@ -138,6 +138,22 @@ if (!$this_year || $this_year !~ m/^\d{4}$/)
     $this_year = $year + 1900;
   }
 
+# Handling of prefixes that start a multi-line comment.
+# In most cases when wrapping lines, the next line should start with
+# spaces rather than the same prefix.
+#
+# By default, we wrap '/*' into " *".
+my %multiline_prefixes = ('/*' => ' *');
+
+# Process the MULTILINE_COMMENT_PREFIXES.  They add and/or override
+# the defaults above.
+for (split /^/, $ENV{MULTILINE_COMMENT_PREFIXES} || '')
+  {
+    next if (/^\s*$/);
+    s/\n+$//; # I do not understand why chomp($_); does not work here.
+    $multiline_prefixes{$_} = ' ' x length($_);
+  }
+
 # Unless the file consistently uses "\r\n" as the EOL, use "\n" instead.
 my $eol = /(?:^|[^\r])\n/ ? "\n" : "\r\n";
 
@@ -149,14 +165,24 @@ while (/(^|\n)(.{0,$prefix_max})$copyright_re/g)
   {
     $leading = "$1$2";
     $prefix = $2;
-    if ($prefix =~ /^(\s*\/)\*(\s*)$/)
+
+    foreach my $p (keys %multiline_prefixes)
       {
-        $prefix =~ s,/, ,;
-        my $prefix_ws = $prefix;
-        $prefix_ws =~ s/\*/ /; # Only whitespace.
-        if (/\G(?:[^*\n]|\*[^\/\n])*\*?\n$prefix_ws/)
+        my $prefix_re = quotemeta($p);
+
+        if ($prefix =~ /^\s*$prefix_re\s*$/)
           {
-            $prefix = $prefix_ws;
+            $prefix =~ s,$prefix_re,$multiline_prefixes{$p},;
+
+            # FIXME: I don't understand the following but I think it is
+            # specific to handling "/* " (turning it into " * ").
+            # No longer needed with the current approach?
+            # my $prefix_ws = $prefix;
+            # $prefix_ws =~ s/\*/ /; # Only whitespace.
+            # if (/\G(?:[^*\n]|\*[^\/\n])*\*?\n$prefix_ws/)
+            #   {
+            #     $prefix = $prefix_ws;
+            #   }
           }
       }
     $ws_re = '[ \t\r\f]'; # \s without \n
