Hi Christos,

Whilst this initially fixed my tools/compat regen problem,
I'm not sure it's correct long term.

I think we need to revert this, and fix usr.bin/m4 -g (GNU)
emulation for m4 regexp() and patsubst() ?

A quick comparison of usr.bin/m4/gnum4.c twiddle() against
https://www.gnu.org/software/gnulib/manual/html_node/emacs-regular-expression-syntax.html
suggests we may need to make the following changes to twiddle():
1. convert m4 \{ and \} into ERE { and }
2. convert m4 { and } into ERE \{ and \}
3. possibly implement \b, \B, \`, and \'

but further analysis and addition of testsuites before & after,
comparing behaviour of GNU m4 versus our m4, is probably warranted.
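
As a rough illustration of items 1 and 2 only, here is a minimal sketch of
the brace swap. The function name swap_brace_escapes() and its interface are
hypothetical; this is not the existing twiddle() code, it ignores bracket
expressions, and it does not attempt item 3.

```c
#include <stddef.h>

/*
 * Hypothetical helper (NOT the real twiddle()): swap the escaping of
 * braces when rewriting an Emacs-style regexp as a POSIX ERE.
 *   \{ and \}  ->  { and }      (interval operators)
 *   { and }    ->  \{ and \}    (literal braces)
 * Other backslash escapes are copied through unchanged.
 * Bracket expressions ([...]) are not special-cased in this sketch.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int
swap_brace_escapes(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	if (outsz == 0)
		return -1;

	while (*in != '\0') {
		char a, b = '\0';	/* up to two output characters */

		if (in[0] == '\\' && (in[1] == '{' || in[1] == '}')) {
			a = in[1];	/* escaped brace becomes bare */
			in += 2;
		} else if (in[0] == '\\' && in[1] != '\0') {
			a = in[0];	/* keep any other escape as-is */
			b = in[1];
			in += 2;
		} else if (in[0] == '{' || in[0] == '}') {
			a = '\\';	/* bare brace becomes escaped */
			b = in[0];
			in += 1;
		} else {
			a = in[0];	/* ordinary character */
			in += 1;
		}

		/* Leave room for the terminating NUL. */
		if (o + (b != '\0' ? 2 : 1) >= outsz)
			return -1;
		out[o++] = a;
		if (b != '\0')
			out[o++] = b;
	}
	out[o] = '\0';
	return 0;
}
```

Given the fragment \${ from the autoconf pattern quoted below, this sketch
produces \$\{, which is the same text the commit wrote by hand in general.m4.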


Details:

autoconf 'm4_bregexp' is GNU m4 1.4.x 'regexp' (well, GNU m4 before 2.x),
and 'm4_bpatsubst' is m4 'patsubst'.
See:
- https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Redefined-M4-Macros.html


GNU m4 1.4.x defines 'regexp' as:
        Searches for regexp in string. The syntax for regular expressions is
        the same as in GNU Emacs, which is similar to BRE, Basic Regular
        Expressions in POSIX. See Syntax of Regular Expressions in the GNU
        Emacs Manual.
'patsubst' is similar:
        Searches string for matches of regexp, and substitutes replacement for
        each match. The syntax for regular expressions is the same as in GNU
        Emacs (see Regexp).
See:
- https://www.gnu.org/software/m4/manual/html_node/Regexp.html
- https://www.gnu.org/software/m4/manual/html_node/Patsubst.html


Emacs regexps look to be BREs with extras including \| for alternation.
See:
- https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexps.html
- https://www.gnu.org/software/emacs/manual/html_node/elisp/Regexp-Backslash.html

Note: however, inspection of the GNU m4 code shows that it uses the gnulib
implementation of Emacs regex, which is documented differently from Emacs'
own regex! See below.


GNU m4 uses gnulib for its regexp implementation (per code inspection),
and doesn't seem to change the syntax from the gnulib default, which
as far as I can tell is Emacs (RE_SYNTAX_EMACS, defined as 0).
See:
- https://www.gnu.org/software/gnulib/manual/html_node/emacs-regular-expression-syntax.html
- https://www.gnu.org/software/gnulib/manual/html_node/Predefined-Syntaxes.html
Note that this seems to differ from the Emacs manual above.


POSIX BREs don't support '|' alternation at all, per NetBSD re_format(7):
        Obsolete (“basic”) regular expressions differ in several respects.
        ‘|’ is an ordinary character and there is no equivalent for its
        functionality.
and Linux (glibc) regex(7):
        Obsolete ("basic") regular expressions differ in several respects.
        '|', '+', and '?' are ordinary characters and there is no equivalent
        for their functionality.
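
A small check along these lines could go into a before/after comparison.
This is only a sketch using the standard POSIX regcomp(3)/regexec(3) calls;
the wrapper name re_matches() is made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical wrapper around POSIX regcomp()/regexec().
 * cflags is 0 for a BRE or REG_EXTENDED for an ERE.
 * Returns 1 on match, 0 on no match, -1 if the pattern does not compile.
 */
int
re_matches(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: we only care whether there is a match at all. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0 ? 1 : 0;
}
```

Compiled as a BRE (cflags 0), the pattern a|b should match only text that
contains the literal three characters a|b, whereas with REG_EXTENDED it
should match text containing either a or b.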


It looks like usr.bin/m4's -g (GNU) compatibility option supports
rewriting the GNU m4 regexp() and patsubst() expressions to be POSIX ERE.
This is done in usr.bin/m4/gnum4.c by the twiddle() function if -g (mimic_gnu)
is in operation.

See intro for proposed solution.


regards,
Luke.



On 23-05-24 10:34, Christos Zoulas wrote:
  | Module Name:        src
  | Committed By:       christos
  | Date:               Wed May 24 14:34:16 UTC 2023
  | 
  | Modified Files:
  |     src/external/gpl3/autoconf/dist/lib/autoconf: general.m4
  | 
  | Log Message:
  | quote { to make regcomp happy
  | 
  | 
  | To generate a diff of this commit:
  | cvs rdiff -u -r1.1.1.1 -r1.2 \
  |     src/external/gpl3/autoconf/dist/lib/autoconf/general.m4
  | 
  | Please note that diffs are not public domain; they are subject to the
  | copyright notices on the relevant files.
  | 

  | Modified files:
  | 
  | Index: src/external/gpl3/autoconf/dist/lib/autoconf/general.m4
  | diff -u src/external/gpl3/autoconf/dist/lib/autoconf/general.m4:1.1.1.1 src/external/gpl3/autoconf/dist/lib/autoconf/general.m4:1.2
  | --- src/external/gpl3/autoconf/dist/lib/autoconf/general.m4:1.1.1.1 Sat Jan 16 13:36:00 2016
  | +++ src/external/gpl3/autoconf/dist/lib/autoconf/general.m4 Wed May 24 10:34:16 2023
  | @@ -2119,7 +2119,7 @@ m4_define([AC_DEFINE_UNQUOTED], [_AC_DEF
  |  # no backslash, no command substitution, no complex variable
  |  # substitution, and no quadrigraphs.
  |  m4_define([_AC_DEFINE_UNQUOTED],
  | -[m4_if(m4_bregexp([$1], [\\\|`\|\$(\|\${\|@]), [-1],
  | +[m4_if(m4_bregexp([$1], [\\\|`\|\$(\|\$\{\|@]), [-1],
  |         [AS_ECHO(["AS_ESCAPE([$1], [""])"]) >>confdefs.h],
  |         [cat >>confdefs.h <<_ACEOF
  |  [$1]
  | 
