[adding bug-autoconf]

On 1/29/19 2:18 PM, Eric Blake wrote:
> On 1/29/19 12:57 PM, Siddhesh Poyarekar wrote:
>> From: Siddhesh Poyarekar <siddh...@sourceware.org>
>>
>> * m4/regex.m4 (gl_REGEX): Add extra escape characters to
>> regular expressions.
>> ---
>>
>> The m4 preprocessor eats up half the escape characters, so give it twice
>> as much. I ran into this when running tests for glibc 2.29 release and
>> verified that this patch fixes the problem.
>
> Which versions of m4 and autoconf are you seeing this under? Can you
> show actual snippets from the generated configure file showing that \
> was eaten? And why are you only touching some of the lines, rather than
> all places where \\ appears in the regex.m4 file? This fix feels fishy,
> and I seriously doubt that escape characters are being eaten by m4, but
> I would like to make sure we have a real root cause understanding what
> prompted this patch.

Aha - it is NOT m4, but the shell's handling of \ in an unquoted
heredoc that is doing it. Compare:

$ bash -c 'cat <<ABC
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"
ABC'
a\b\c\\d\\e"f\g\h\\i\\j"
$ bash -c 'cat <<\ABC
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"
ABC'
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"
$ dash -c 'cat <<ABC
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"
ABC'
a\b\c\\d\\e"f\g\h\\i\\j"
$ dash -c 'cat <<\ABC
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"
ABC'
a\b\\c\\\d\\\\e"f\g\\h\\\i\\\\j"

>
>> +++ b/m4/regex.m4
>> @@ -204,7 +204,7 @@ AC_DEFUN([gl_REGEX],
>>                         & ~RE_CONTEXT_INVALID_DUP
>>                         & ~RE_NO_EMPTY_RANGES);
>>              memset (&regex, 0, sizeof regex);
>> -            s = re_compile_pattern ("[[:alnum:]_-]\\\\+$", 16, &regex);
>> +            s = re_compile_pattern ("[[:alnum:]_-]\\\\\\\\+$", 16, &regex);

m4/regex.m4 is using an AC_LANG_PROGRAM() macro, which has the
unfortunate longstanding behavior (present in autoconf 2.69, and
apparently going back to much older releases) of eventually expanding as:

m4_define([AC_LANG_CONFTEST(C)],
[cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
$1
_ACEOF])

which produces an unquoted heredoc and therefore eats duplicated \
inside any program snippets. We can't switch to quoted heredocs
(because users may have come to expect expansion of $shellvar when
writing their programs), and the problem is not apparent with the most
common usage of a single \. We could teach autoconf 2.70 to double up
\\ automatically in a lang snippet (which scales better than every .m4
file having to double up, and means you can copy snippets back and
forth between .m4 and .c files without having to remember to
add/subtract \) - but there's still the issue of catering to distros
still using older autoconf (gnulib can force a new behavior borrowed
from a patched autoconf, but not everyone uses gnulib).
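
To see the heredoc behavior described above on the regex line itself, here is a minimal shell sketch. It is not taken from autoconf: emit_unquoted, emit_quoted and the variable names are hypothetical, and the sed step is only one assumed way of doubling every backslash before the text reaches an unquoted heredoc.

```shell
#!/bin/sh
# Sketch only: emit_unquoted, emit_quoted and the variable names are
# hypothetical; nothing here is taken from autoconf itself.

# Unquoted delimiter, like the heredoc that AC_LANG_CONFTEST expands to:
# the shell collapses each pair of backslashes into one.
emit_unquoted() {
cat <<_ACEOF
"[[:alnum:]_-]\\\\+$"
_ACEOF
}

# Quoted delimiter: the body is copied verbatim.
emit_quoted() {
cat <<\_ACEOF
"[[:alnum:]_-]\\\\+$"
_ACEOF
}

halved=$(emit_unquoted)     # only two backslashes survive
verbatim=$(emit_quoted)     # all four backslashes survive

# Assumed workaround: double every backslash first, then pass the
# result through an unquoted heredoc run by a child shell.
snippet='"[[:alnum:]_-]\\\\+$"'
doubled=$(printf '%s\n' "$snippet" | sed 's/\\/\\\\/g')
roundtrip=$(sh -c "cat <<_ACEOF
$doubled
_ACEOF
")

printf '%s\n' "unquoted:  $halved" "quoted:    $verbatim" "roundtrip: $roundtrip"
```

The unquoted case should print the string with two backslashes, while the quoted and round-trip cases keep all four. Note that blanket doubling like this would also change a deliberate \$ in a snippet, so it is only an illustration of the idea, not a proposal for how autoconf should implement it.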

Looking at m4/fnmatch.m4, it looks like we are already used to the idea
of doubling up \\ that will pass through AC_LANG_PROGRAM() and the
unquoted heredoc; on those grounds, your patch is correct. If nothing
else, the Autoconf manual should be updated to mention this behavior of
\\ in source code snippets in m4 files (if autoconf does not go one step
further and double them automatically).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org