Failing to use gnulib bootstrap in libtool

Gary V. Vaughan Thu, 02 Sep 2010 09:37:02 -0700

Is gnulib bootstrap designed for reuse in other projects?

I'm finding it extremely difficult to understand a lot of the code, let alone 
incorporate it into Libtool.  I have some fixes for the obvious bugs, and a lot 
of questions about the design choices, below.  In short, it could use a 
complete rewrite.  I'd be happy to do the work, as long as I can be reasonably 
confident that it will be accepted upstream, and that the more obtuse parts of 
the incumbent script are explained to me.


These are the issues I've found so far:


1. gnulib_mk
============

> # Name of the Makefile.am
> gnulib_mk=gnulib.mk

What is this for?  The only use in the rest of the script is inside the slurp() 
function, who's purpose I cannot fathom:

>      if test $file = Makefile.am && test "X$gnulib_mk" != XMakefile.am; then
>        copied=$copied${sep}$gnulib_mk; sep=$nl
>        remove_intl='/^[^#].*\/intl/s/^/#/;'"s!$bt_regex/!!g"
>        sed "$remove_intl" $1/$dir/$file | cmp - $dir/$gnulib_mk > /dev/null 
> || {
>          echo "$0: Copying $1/$dir/$file to $dir/$gnulib_mk ..." &&
>          rm -f $dir/$gnulib_mk &&
>          sed "$remove_intl" $1/$dir/$file >$dir/$gnulib_mk
>        }

It seems to have something to do with removing the intl subdirectory, so must 
pertain to gettext. But then why is it called `gnulib.mk'?


2. source_base, m4_base etc
===========================

Why hardcode another copy of these locations into bootstrap, where they can 
fall out of sync.  Far better to either extract them from or inject them into 
`gnulib-cache.m4'.  Extracting them with `autom4te --language=Autoconf 
--trace=gl_SOURCE_BASE --trace=gl_M4_BASE ... configure.ac' seems like the most 
robust way of tackling this problem to me.


3. MSGID_BUGS_ADDRESS
=====================

> msgid_bugs_address=bug-$pack...@gnu.org

Wouldn't it be better to extract this from AC_INIT at the same time as you run 
sed over "$extract_package_name", rather than guess a default, and require 
keeping two references to the address in sync?

    tab='       '
    extract_package_values's,#.*$,,; s,^dnl .*$,,; s, dnl .*$,,;
        /AC_INIT(/                   {
            s|^.*AC_INIT([['"$tab"' ]*||
            s|^\([^],]*\)[]['"$tab"' ]*,|package="\1"; package_name="\1",|
            s|,[['"$tab"' ]*\([^],]*\)[]['"$tab"' ]*,|; version="\1",|
            s|,[['"$tab"' ]*\([^])]*\).*$|; package_bugreport="\1"|
            s|^package="GNU |package="|
            h
            s|;.*$||
            y|ABCDEFGHIJKLMNOPQRSTUVWXYZ|abcdefghijklmnopqrstuvwxyz|
            x
            s|^[^;]*; ||
            G
            p
        }
    '
    eval `sed -n "$extract_package_values"` || exit $?

    ...

    MSGID_BUGS_ADDRESS=$package_bugreport


4. bootstrap.conf
=================

Why is bootstrap.conf called below the definition of find_tool(), but the other 
functions
are redefined much later on.  It would be far more flexible to encapsulate most 
of the
functionality of `bootstrap' into shell functions (preferably well commented so 
that they
can be easily understood!) before sourcing `bootstrap.conf' so that those 
functions could
be overridden safely before they are called by the remaining code.

I don't like the way gnulib `bootstrap' forces a `gnulib-tool --import' 
invocation on every
run, when a `gnulib-tool --update' is quite sufficient most of the time.  If 
the calls
to gnulib-tool were in a function set before sourcing bootstrap.conf, I could 
redefine the
function for Libtool without upsetting other users of the script by fighting 
for a change
of semantics.


5. option parsing
=================

The option parse loop is not as robust as it could be: For example, it doesn't 
support
`./bootstrap -fc --gnulib-srcdir ../gnulib' for a couple of reasons.

I wonder whether you would accept a generated script to check in to the gnulib 
repo,
along with a few lines of shell and an extra m4sh file to regenerate that file 
whenever
the source is edited?

Here's the source:

dnl SHORT   LONG                   DEFAULT              INIT
dnl ----------------------------------------------------------------------
M4SH_GETOPTS(
  [c],    [--copy],                 [],                  [],
  [f],    [--force],                [],                  [],
  [!],    [--gnulib-srcdir],        [],                  [
        GNULIB_SRCDIR="$optarg"],
  [],     [--skip-po],              [],                  [],
[[
  test $# -gt 0 \
    && func_fatal_help "too many arguments: $@"
]])


And here's the generated option parsing loop:

debug_cmd=:
exit_cmd=:

# Option defaults:
opt_copy=false
opt_force=false
opt_skip_po=false

# Parse options once, thoroughly.  This comes as soon as possible in the
# script to make things like `--version' happen as quickly as we can.
{
  # this just eases exit handling
  while test $# -gt 0; do
    opt="$1"
    shift
    case $opt in
      --debug|-x)    opt_debug_cmd='set -x'
            func_echo "enabling shell trace mode"
            $opt_debug_cmd
            ;;
      --copy|-c)
            opt_copy=:
            ;;
      --force|-f)
            opt_force=:
            ;;
      --gnulib-srcdir)
            test $# = 0 && func_missing_arg $opt && break
            optarg="$1"
            opt_gnulib_srcdir="$optarg"
            GNULIB_SRCDIR="$optarg"
            shift
            ;;
      --skip-po)
            opt_skip_po=:
            ;;

      -\?|-h)        func_usage                  ;;
      --help)        func_help                   ;;
      --version)     func_version                ;;

      # Separate optargs to long options:
      --*=*)
            func_split_long_opt "$opt"
            set dummy "$func_split_long_opt_name" "$func_split_long_opt_arg" 
${1+"$@"}
            shift
            ;;

      # Separate non-argument short options:
      -\?*|-h*|-c*|-f*)
            func_split_short_opt "$opt"
            set dummy "$func_split_short_opt_name" "-$func_split_short_opt_arg" 
${1+"$@"}
            shift
            ;;

      --)        break                    ;;
      -*)        func_fatal_help "unrecognized option \`$opt'" ;;
      *)        set dummy "$opt" ${1+"$@"};    shift; break  ;;
    esac
  done

  # Validate options:
  test 0 -gt 0 \
    && func_fatal_help "too many arguments: "

  # Bail if the options were screwed
  $exit_cmd $EXIT_FAILURE
}

Additionally, some of the prerequisites files needed before running autotools 
are generated files in Libtool, namely `ltmain.sh' and `ltversion.m4'.  I've 
tried to handle that by putting the generation code, which calls `make', in 
`bootstrap.conf'.  But then it starts outputting things before the options are 
parsed, which makes for very surprising behaviour with `./bootstrap --help'.  
Making the parse loop do nothing but set variables to indicate options that 
have been seen already, handle `--help' and the like early on, and consume the 
options it has processed would make for easy option extensions in 
`bootstrap.conf'.


6. AC_CONFIG_AUX_DIR detection
==============================

Why another bunch of forks?  Might as well get this data while running sed over
configure the first time, by adding the following to `extract_package_values':

    /AC_CONFIG_AUX_DIR(/         {
        s|^.*AC_CONFIG_AUX_DIR([['"$tab"' ]*\([^])]*\).*$|config_aux_dir=\1|
        p
    }

And change the following code:

> if test $found_aux_dir = no; then
>   echo "$0: expected line not found in configure.ac. Add the following:" >&2
>   echo "  AC_CONFIG_AUX_DIR([$build_aux])" >&2
>   exit 1
> fi

to:

    test -n "$config_aux_dir" || {
        echo "$0: expected line not found in configure.ac.  Add the following:" 
>&2
        echo "  AC_CONFIG_AUX_DIR([$build_aux])
        exit 1
    }


7. buildreqs
============

Why do we have to keep 2 copies of this information in sync?  `bootstrap' can
easily extract the prequisite versions of Autoconf, Automake, Libtool and
Gettext from configure.ac for us.  I didn't have this in my bootstrap script
so I don't have code to paste in, but it seems like it would just be a few more
lines in `extract_package_values'.

It would be nice if gnulib also had a template for README-prereq with 
instructions
for Autotools already entered.


8. gnulib submodule
===================

> git_modules_config () {
>   test -f .gitmodules && git config --file .gitmodules "$@"
> }
>
> gnulib_path=`git_modules_config submodule.gnulib.path`
> : ${gnulib_path=gnulib}

So when I don't have .gitmodules, gnulib_path is set to the empty string, which
prevents the follow line from falling back to a default value.

Actually, I can't follow the logic of the case statement that follows, although
I think it is not supposed to do nothing at all when I have no `.gitmodules' 
file,
and pass `--gnulib-srcdir=../gnulib' to try and get a reference submodule 
installed.

The gnulib submodule code in GNU M4 is, by comparison, easy to follow... and it
works.  Aside from adding enough comments to make the logic understandable, and
fixing it to work in the common case of `--gnulib-srcdir=../gnulib' with no 
`.gitmodules'
file, I think the whole thing should be put inside a function before 
`bootstrap.conf',
so that I can override it with my preferred semantics.

And why does GNULIB_SRCDIR contain the location of the referenced gnulib tree
for the first half of the script, and then get reused to hold the location of 
the
submodule itself for the rest of the script?  `$gnulib_path' seems to hold the
right value already, so there's no need to change the meaning of `GNULIB_SRCDIR'
partway down.


9. i18n
=======

Libtool doesn't have this.  And my brain hurts enough from trying to make sense 
of
the rest of this file, so I haven't looked at any of the po files code.  
Although I
think someone who understands the issues should, since it probably needs some 
love
too.


10. slurp()
==========

I can't figure out what this function's purpose is.  It seems to care about the
`gnulib.mk' file, and probably has something to do with making lists of files
that gettext doesn't like and editing away the discrepancies when copying
imported files out of the temporary tree into their correct location?  If 
someone
could explain what it's for (I can follow the code line-by-line, but out of 
context
it still makes no sense), or commit some comments that would be a HUGE help.


11. gnulib_tool_options
=======================

This is broken quite badly:

  `--import': Most of the time I'd rather use `--update', but there's no way
        to override it without using a gnulib-local patch (and I couldn't even
        get that to work for `bootstrap' since it's not even in a module).
  `--lib $gnulib_name' (and others): where `gnulib_name=lib$package' by default.
        Surely most people want to call it `libgnu' (not `liblibtool'!)?  And in
        any case, these are already stored in gnulib-cache.m4, so we shouldn't
        have to keep two references in sync.

I really don't understand why gnulib is installed to a temporary directory and
bits of it copied across with name mangling to compensate.  Is this to 
work-around
a timestamp issue or something?  If so, I expect calling `gnulib-tool --update'
when appropriate would be far cleaner.  If not, I'd still like to be able to
override it in Libtool without having to patch the gnulib `bootstrap' every time
I copy it over.

Additionally, this would be better set above where `bootstrap.conf' is sourced
so it can be overridden.


12. AUTOHEADER and lack of AUTORECONF
=====================================

> # Skip autoheader if it's not needed.
> grep -E '^[    ]*AC_CONFIG_HEADERS?\>' configure.ac >/dev/null ||
>   AUTOHEADER=true

Autoreconf already does this, and much more reliably since it uses m4 traces to
see whether AC_CONFIG_HEADERS was tucked away in aclocal.m4, or the result of
another expansion etc. etc.

And for that matter, why are libtoolize, aclocal and others called directly
at all?  Autoreconf does this too, and much more reliably (although I recall
it had some issues with libtoolize at one point I don't seem to trip over
that one any more).

>       "${ACLOCAL-aclocal} --force -I m4" \

And why are the flags hard-coded?  These are already stored in ACLOCAL_AMFLAGS,
not to mention that my macro directory is libltdl/m4, so `bootstrap' as is
can't find them.

Autoreconf does a better job here too.  I suggest replacing this entire block
with:

    ${AUTORECONF-autoreconf} $autoreconf_flags

And either putting in a function above where `bootstrap.conf' is sourced so it
can be overriden, if autoreconf isn't quite good enough.

Actually, Libtool has several directories to reconfigure at bootstrap time,
so I'd much prefer a function that I can redefine to loop through the list of
directories.

Cheers,
-- 
Gary V. Vaughan (g...@gnu.org)

PGP.sig
Description: This is a digitally signed message part

Failing to use gnulib bootstrap in libtool

Reply via email to