Hi there! Our local CVSROOT scripts are based on a 1997 year version of FreeBSD CVSROOT scripts, slightly modified to fit local needs, and we recently noticed a problem with not getting email notifications if the affected repository path has spaces in its name. I have verified that the modern CVSROOT scripts (which I am planning to migrate to soon) are subject to this problem as well.
There is a couple of problems actually. The first problem turns out to be a known limitation in cvs(1), in a way how it constructs the argument to the log_accum.pl script (using the %s in CVSROOT/loginfo). With %s, both repository path and the list of files (added, removed, or changed) in that directory are passed as a single argument to the log_accum.pl script, separated by a single whitespace character, which obviously creates a problem for a parser in log_accum.pl that does a simple split(). The solution that works here and was inspired by reading the following comment in src/contrib/cvs/src/logmsg.c, describing the % format specifiers, /* All other characters, we insert an empty field (but we do put in the comma separating it from other fields). This way if future CVS versions add formatting characters, one can write a loginfo file which at least won't blow up on an old CVS. */ /* Note that people who have to deal with spaces in file and directory names are using space to get a known delimiter for the directory name, so it's probably not a good idea to ever define that as a formatting character. */ was to use %{ s} to get a known " ," delimiter. (Chances still are that the file/directory name will contain the " ," sequence, but this is less likely.) With this fix, no emails were lost anymore, but the contents were still damaged. If a directory name had a space in its name, the log_accum.pl::get_revision_number() that does a simple split() when parsing the "Repository revision" string from "cvs -Qn status", hits the whitespace problem. That was easy to fix, by limiting splitting to four elements only. Then I hit a real PROBLEM. The log_accum.pl script parses the standard log message generated by logmsg.c, to build three lists of added, removed, and modified files, each hashed by the "tag" (yes, it's possible to commit to different branches in one commit, and even to commit to a single file twice in a single commit). I first thought that I could just take the list of files from the argument to the script (generated by %{ s}), but there is no indication of "added/removed/changed", and, more importantly, there is no "tag" information. And the problem with building these lists by parsing the log message is that the lists of added, removed, and modified files are formatted by logmsg.c as separated by a single whitespace character (and to make it fit the 70-column width), which creates the same problem if a file has a space in a name. My first hack was to patch logmsg.c to have it list files one per line, so, for example, instead of Modified files: a b c I have now been getting Modified files: a b c This worked well, but I wanted a real SOLUTION. The solution I have found is to implement the %T modifier for "loginfo" scripts, which prints the tag (or HEAD for trunk). Combined with the %V and %v modifiers which print "NONE" for added and removed files, respectively, I have been able to fix the log_accum.pl's logic to not depend on the contents of the log message to build lists of added, removed, and modified files. Attached in this email are: - a patch for src/contrib/cvs that implements the %T (tag) modifier for CVSROOT/loginfo scripts, - a patch for log_accum.pl and loginfo that uses %T to deal with whitespaces in directory and file names. I'd like some input here before I submit this patch to CVS developers. P.S. The patch also removes two /usr/local/bin/awake uses from log_accum.pl, as these made it unuseable for third party application. Josef, Peter suggested me to ask you to please address this particular problem with "awake", and to make this script useful for third-parties again. Cheers, -- Ruslan Ermilov Sysadmin and DBA, [EMAIL PROTECTED] Sunbay Software Ltd, [EMAIL PROTECTED] FreeBSD committer
Index: doc/cvs.texinfo =================================================================== RCS file: /home/ncvs/src/contrib/cvs/doc/cvs.texinfo,v retrieving revision 1.1.1.13 diff -u -p -r1.1.1.13 cvs.texinfo --- doc/cvs.texinfo 2 Dec 2002 03:13:37 -0000 1.1.1.13 +++ doc/cvs.texinfo 21 Jun 2003 20:21:34 -0000 @@ -12888,6 +12888,8 @@ separators. The format characters are: @table @t @item s file name [EMAIL PROTECTED] T +tag @item V old version number (pre-checkin) @item v Index: src/logmsg.c =================================================================== RCS file: /home/ncvs/src/contrib/cvs/src/logmsg.c,v retrieving revision 1.11 diff -u -p -r1.11 logmsg.c --- src/logmsg.c 2 Dec 2002 03:17:48 -0000 1.11 +++ src/logmsg.c 21 Jun 2003 20:24:12 -0000 @@ -683,6 +683,15 @@ title_proc (p, closure) strlen (str_list) + strlen (p->key) + 5); (void) strcat (str_list, p->key); break; + case 'T': + str_list = + xrealloc (str_list, + (strlen (str_list) + + (li->tag ? strlen (li->tag) : 0) + + 10) + ); + (void) strcat (str_list, (li->tag ? li->tag : "HEAD")); + break; case 'V': str_list = xrealloc (str_list, @@ -767,6 +776,7 @@ logfile_write (repository, filter, messa `}' as separators. The format characters are: s = file name + T = tag V = old version number (pre-checkin) v = new version number (post-checkin) Index: src/mkmodules.c =================================================================== RCS file: /home/ncvs/src/contrib/cvs/src/mkmodules.c,v retrieving revision 1.12 diff -u -p -r1.12 mkmodules.c --- src/mkmodules.c 2 Sep 2002 05:57:13 -0000 1.12 +++ src/mkmodules.c 21 Jun 2003 20:16:33 -0000 @@ -69,6 +69,7 @@ static const char *const loginfo_content "# characters are:\n", "#\n", "# s = file name\n", + "# T = tag\n", "# V = old version number (pre-checkin)\n", "# v = new version number (post-checkin)\n", "#\n",
diff --exclude=CVS -ru CVSROOT-freebsd/loginfo CVSROOT/loginfo --- CVSROOT-freebsd/loginfo Fri Feb 28 05:35:48 2003 +++ CVSROOT/loginfo Sat Jun 21 22:55:15 2003 @@ -27,4 +27,4 @@ #DEFAULT (echo ""; id; echo %s; date; cat) >> $CVSROOT/CVSROOT/commitlog # or #DEFAULT (echo ""; id; echo %{sVv}; date; cat) >> $CVSROOT/CVSROOT/commitlog -DEFAULT $CVSROOT/CVSROOT/log_accum.pl %s +DEFAULT $CVSROOT/CVSROOT/log_accum.pl %{ TVvs} diff --exclude=CVS -ru CVSROOT-freebsd/log_accum.pl CVSROOT/log_accum.pl --- CVSROOT-freebsd/log_accum.pl Fri Feb 28 20:28:11 2003 +++ CVSROOT/log_accum.pl Sat Jun 21 23:00:25 2003 @@ -34,10 +34,7 @@ # ############################################################ my $STATE_NONE = 0; -my $STATE_CHANGED = 1; -my $STATE_ADDED = 2; -my $STATE_REMOVED = 3; -my $STATE_LOG = 4; +my $STATE_LOG = 1; my $BASE_FN = "$cfg::TMPDIR/$cfg::FILE_PREFIX"; my $LAST_FILE = $cfg::LAST_FILE; @@ -233,7 +230,7 @@ while (<RCS>) { if (/^[ \t]*Repository revision/) { chomp; - my @revline = split; + my @revline = split(" ", $_, 4); $revision = $revline[2]; $revline[3] =~ m|^$CVSROOT/+(.*),v$|; $rcsfile = $1; @@ -635,8 +632,7 @@ # Initialize basic variables # my $input_params = $ARGV[0]; -my ($directory, @filenames) = split " ", $input_params; [EMAIL PROTECTED] = split(' ', $input_params); +my ($directory, @fileinfo) = split " ,", $input_params; my @path = split('/', $directory); my $dir; @@ -647,6 +643,23 @@ } $dir = $dir . "/"; +my @filenames; +my %added_files; # Hashes containing lists of files +my %changed_files; # that have been changed, keyed +my %removed_files; # by branch tag. +foreach (@fileinfo) { + my ($tag, $oldrev, $newrev, $filename) = split(/,/, $_, 4); + push @filenames, $filename; + if ($oldrev eq "NONE") { + push @{ $added_files{$tag} }, $filename; + } elsif ($newrev eq "NONE") { + push @{ $removed_files{$tag} }, $filename; + } else { + push @{ $changed_files{$tag} }, $filename; + } + &append_line($TAGS_FILE, $tag); +} + # # Throw some values at the developer if in debug mode # @@ -697,7 +710,6 @@ &do_changes_file(@text); &mail_notification(@text); - system("/usr/local/bin/awake", $directory); &cleanup_tmpfiles(); exit 0; } @@ -705,49 +717,26 @@ # # Iterate over the body of the message collecting information. # -my %added_files; # Hashes containing lists of files -my %changed_files; # that have been changed, keyed -my %removed_files; # by branch tag. - my @log_lines; # The lines of the log message. -my $tag = "HEAD"; # Default branch is HEAD. my $state = $STATE_NONE; # Initially in no state. while (<STDIN>) { s/[ \t\n]+$//; # delete trailing space - # parse the revision tag if it exists. - if (/^Revision\/Branch:(.*)/) { $tag = $1; next; } - if (/^[ \t]+Tag: (.*)/) { $tag = $1; next; } - if (/^[ \t]+No tag$/) { $tag = "HEAD"; next; } - - # check for a state change, guarding against similar markers - # in the log message itself. - unless ($state == $STATE_LOG) { - if (/^Modified Files/) { $state = $STATE_CHANGED; next; } - if (/^Added Files/) { $state = $STATE_ADDED; next; } - if (/^Removed Files/) { $state = $STATE_REMOVED; next; } - if (/^Log Message/) { $state = $STATE_LOG; next; } + # don't do anything if we're not in a state. + if ($state == $STATE_NONE) { + $state = $STATE_LOG if /^Log Message/; + next; } - # don't so anything if we're not in a state. - next if $state == $STATE_NONE; - # collect the log line (ignoring empty template entries)? if ($state == $STATE_LOG) { next if /^(.*):$/ and $cfg::TEMPLATE_HEADERS{$1}; push @log_lines, $_; } - - # otherwise collect information about which files changed. - my @files = split; - push @{ $changed_files{$tag} }, @files if $state == $STATE_CHANGED; - push @{ $added_files{$tag} }, @files if $state == $STATE_ADDED; - push @{ $removed_files{$tag} }, @files if $state == $STATE_REMOVED; } -&append_line($TAGS_FILE, $tag); # # Strip leading and trailing blank lines from the log message. @@ -896,7 +885,6 @@ &mail_notification(@log_msg); } -system("/usr/local/bin/awake", $directory); &cleanup_tmpfiles(); exit 0; # EOF
pgp00000.pgp
Description: PGP signature