On 2016-03-11 22:22, Michael Haubenwallner wrote:
> Hi Peter,
>
> thanks for looking at the patch!
>
> On 03/10/2016 12:29 PM, Peter Rosin wrote:
>> Hi Michael,
>>
>> I had a look since I wrote a patch for POSIX nm a couple of years ago
>> that I never submitted (I didn't see any use case) which looked very
>> similar, excepting the AIX-ism in your version.
>>
>> On 2016-03-10 10:01, Michael Haubenwallner wrote:
>>> * m4/libtool.m4 (LT_PATH_NM): Detect POSIX-compatible nm for AIX. In
>>> BSD mode, the AIX nm does not tell whether a symbol is weak, need to use
>>> POSIX mode instead.
>>> (_LT_CMD_GLOBAL_SYMBOLS): Support POSIX-compatible nm. Reorder to allow
>>> for platform specific hooks during transformation of global_symbol_pipe
>>> into C source code. For AIX, set hook to transform even weak text
>>> symbols as text symbols.
>>> (_LT_LINKER_SHLIBS): Use global_symbol_pipe to simplify forming the
>>> export_symbols_cmds for AIX.
>>> ---
>>> m4/libtool.m4 | 101
>>> ++++++++++++++++++++++++++++++++--------------------------
>>> 1 file changed, 55 insertions(+), 46 deletions(-)
>>>
>>> diff --git a/m4/libtool.m4 b/m4/libtool.m4
>>> index 2c0e657..6134522 100644
>>> --- a/m4/libtool.m4
>>> +++ b/m4/libtool.m4
>>> @@ -3755,10 +3755,10 @@ _LT_DECL([], [want_nocaseglob], [1],
>>>
>>> # LT_PATH_NM
>>> # ----------
>>> -# find the pathname to a BSD- or MS-compatible name lister
>>> +# find the pathname to a BSD-, POSIX- or MS-compatible name lister
>>> AC_DEFUN([LT_PATH_NM],
>>> [AC_REQUIRE([AC_PROG_CC])dnl
>>> -AC_CACHE_CHECK([for BSD- or MS-compatible name lister (nm)], lt_cv_path_NM,
>>> +AC_CACHE_CHECK([for BSD-, POSIX- or MS-compatible name lister (nm)],
>>> lt_cv_path_NM,
>>> [if test -n "$NM"; then
>>> # Let the user override the test.
>>> lt_cv_path_NM=$NM
>>> @@ -3808,6 +3808,26 @@ else
>>> : ${lt_cv_path_NM=no}
>>> fi])
>>> if test no != "$lt_cv_path_NM"; then
>>> + case $host_os in
>>> + aix[[4-9]]*)
>>> + # With AIX nm we need the '-l' flag to get the "weak" information
>>> + # for the Import File, but '-l' is ignored with the '-B' flag. So
>>> + # we use the '-P' (POSIX) flag instead. As users often provide the
>>> + # '-B' flag, which conflicts with '-P', we drop any provided flag.
>>> + # AIX nm needs the '-C' flag to disable demangling. For both GNU
>>> + # and AIX nm, the '-g' flag shows public (global) symbols only,
>>> + # and the '-p' flag disables sorting to improve performance.
>>> + set dummy $lt_cv_path_NM
>>> + case `@S|@2 -V 2>&1` in
>>> + *GNU* | *'with BFD'*)
>>> + lt_cv_path_NM="@S|@2 -Bgp"
>>> + ;;
>>> + *)
>>> + lt_cv_path_NM="@S|@2 -PlCgp"
>>> + ;;
>>> + esac
>>> + ;;
>>> + esac
>>
>> You are overriding the user provided $NM. Not good. If a user says
>> NM="nm --this-will-not-work", then you will have to trust that even if
>> it is not likely to work. User error, so what? Adding -Bgp or -PlCgp
>> can only be done when the user has not specified $NM.
>
> Agreed. I've added a check whether NM will mark weak symbols instead.

I was thinking that you needed to try various flags for each nm in the
mentioned loop until you find a good nm/flags combo, and keep looking
if you think you might find an even better combo later (i.e. what is
there today, where a BSD nm is preferred over other name listers, but
tweaked to suit AIX, which seemingly prefers POSIX nm above all else).
Then, when you have found an nm/flags combo (or if the user has
provided it), and this part was already ok in the patch, you make
libtool detect if the $NM interface is POSIX, BSD, MS dumpbin or ...,
and build the symbol pipe accordingly.

>> Yes, I see that
>> AIX has previously added nm flags behind the back of the user, but there
>> is no reason to continue with that now that you are changing things.
>>
>> You need to modify innards of the lt_tmp_nm loop in the else branch
>> a few lines up (just above the context).
>>
>>> NM=$lt_cv_path_NM
>>> else
>>> # Didn't find any BSD compatible name lister, look for dumpbin.
>>> @@ -3832,7 +3852,7 @@ fi
>>> test -z "$NM" && NM=nm
>>> _LT_SET_TOOL_ABI_FLAG([NM])
>>> AC_SUBST([NM])
>>> -_LT_DECL([], [NM], [1], [A BSD- or MS-compatible name lister])dnl
>>> +_LT_DECL([], [NM], [1], [A BSD-, POSIX- or MS-compatible name lister])dnl
>>>
>>> AC_CACHE_CHECK([the name lister ($NM) interface], [lt_cv_nm_interface],
>>> [lt_cv_nm_interface="BSD nm"
>>> @@ -3847,6 +3867,8 @@ AC_CACHE_CHECK([the name lister ($NM) interface],
>>> [lt_cv_nm_interface],
>>> cat conftest.out >&AS_MESSAGE_LOG_FD
>>> if $GREP 'External.*some_variable' conftest.out > /dev/null; then
>>> lt_cv_nm_interface="MS dumpbin"
>>> + elif $GREP '^[[ ]]*_*some_variable' conftest.out > /dev/null; then
>>> + lt_cv_nm_interface="POSIX nm"
>>
>> Isn't this a pretty weak check, perhaps append ' B' and remove the
>> possibility
>> for leading whitespace? (see my last comment below for reasoning on spaces)
>
> As long as the expected symbol name comes first, isn't it POSIX then?
> Anyway, I've added "[\t ][\t ]*[A-Za-z]" now, as $symcode is defined later.
> And there is no check for BSD style after all.

Since it is POSIX output, my point is that it should be fairly safe to
assume B as the symbol type. Maybe it could be a D if the tools do not
put zero-vars in bss, but why wouldn't they? So, perhaps [BD] is a more
palatable pattern? I simply don't think you need to match every
possible symbol type. Do you?

>>
>>> fi
>>> rm -f conftest*])
>>> ])# LT_PATH_NM
>>> @@ -4012,8 +4034,33 @@ symcode='[[BCDEGRST]]'
>>> # Regexp to match symbols that can be accessed directly from C.
>>> sympat='\([[_A-Za-z]][[_A-Za-z0-9]]*\)'
>>>
>>> +if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>> + # Gets list of data symbols to import.
>>> + lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>> + # Adjust the below global symbol transforms to fixup imported variables.
>>> + lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char
>>> \1;/p'"
>>> + lt_c_name_hook=" -e 's/^I .* \(.*\)$/ {\"\1\", (void *) 0},/p'"
>>> + lt_c_name_lib_hook="\
>>> + -e 's/^I .* \(lib.*\)$/ {\"\1\", (void *) 0},/p'\
>>> + -e 's/^I .* \(.*\)$/ {\"lib\1\", (void *) 0},/p'"
>>> +else
>>> + # Disable hooks by default.
>>> + lt_cv_sys_global_symbol_to_import=
>>> + lt_cdecl_hook=
>>> + lt_c_name_hook=
>>> + lt_c_name_lib_hook=
>>> +fi
>>> +
>>> # Define system-specific variables.
>>> case $host_os in
>>> +aix[[4-9]]*)
>>> + case `$NM -V 2>&1` in
>>> + *GNU* | *'with BFD'*) ;;
>>> + *)
>>> + symcode='[[BDLTVWZ]]'
>>> + lt_cdecl_hook=" -e 's/^W/T/p'" # weak text symbol
>>> + esac
>>> + ;;
>>
>> Why does AIX need to export weak symbols, when W symbols are not
>> handled in the nm output on other systems? This seems inconsistent?
>
> Erm, with GNU nm, $symcode actually does contain W. And a weak symbol
> is referenced as variable in the lt_*_LTX_preloaded_symbols array,
> even if it might actually be a text symbol... What do I miss here?

It is probably me missing something, like looking at the default
symcodes instead of the GNU nm symcodes. My bad.

> Why there is need for the weakness information: The aix-soname=svr4
> feature uses Import Files to provide filename-based shared library
> versioning, so a subsequent linker does actually link against a text
> file rather than some binary shared object. And the Import File allows
> to specify the weak keyword, while it is ignored in an Export File.
> So the content of the Export File used to create a shared library is
> provided as the Import File needed to link against that shared library.

Ok, I clearly don't know this area; I was just asking because I thought
I saw an inconsistency. I guess it is ok on other systems, and if not,
I guess that is not really your responsibility. Sorry for the noise...

>>
>>> aix*)
>>> symcode='[[BCDT]]'
>>> ;;
>>> @@ -4054,23 +4101,6 @@ case `$NM -V 2>&1` in
>>> symcode='[[ABCDGIRSTW]]' ;;
>>> esac
>>>
>>> -if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>> - # Gets list of data symbols to import.
>>> - lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>> - # Adjust the below global symbol transforms to fixup imported variables.
>>> - lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char
>>> \1;/p'"
>>> - lt_c_name_hook=" -e 's/^I .* \(.*\)$/ {\"\1\", (void *) 0},/p'"
>>> - lt_c_name_lib_hook="\
>>> - -e 's/^I .* \(lib.*\)$/ {\"\1\", (void *) 0},/p'\
>>> - -e 's/^I .* \(.*\)$/ {\"lib\1\", (void *) 0},/p'"
>>> -else
>>> - # Disable hooks by default.
>>> - lt_cv_sys_global_symbol_to_import=
>>> - lt_cdecl_hook=
>>> - lt_c_name_hook=
>>> - lt_c_name_lib_hook=
>>> -fi
>>> -
>>> # Transform an extracted symbol line into a proper C declaration.
>>> # Some systems (esp. on ia64) link data and code symbols differently,
>>> # so use this general approach.
>>> @@ -4128,6 +4158,9 @@ for ac_symprfx in "" "_"; do
>>> " s[1]~/^[@?]/{print f,s[1],s[1]; next};"\
>>> " s[1]~prfx {split(s[1],t,\"@\"); print
>>> f,t[1],substr(t[1],length(prfx))}"\
>>> " ' prfx=^$ac_symprfx]"
>>> + elif test "$lt_cv_nm_interface" = "POSIX nm"; then
>>> + symxfrm="\\2 $ac_symprfx\\1 \\1"
>>> + lt_cv_sys_global_symbol_pipe="sed -n -e 's/^[[
>>> ]]*$ac_symprfx$sympat[[ ]][[ ]]*\($symcode$symcode*\)[[
>>> ]][[ ]]*.*$opt_cr$/$symxfrm/p'"
>>
>> Do you really need to handle leading and multiple whitespace here?
>> Posix, at least as seen here
>> http://pubs.opengroup.org/onlinepubs/009696699/utilities/nm.html
>> seems quite clear on no leading space and one space only as separator.
>
> Must admit that I haven't looked at the specs - and except for leading
> ones, AIX nm does write multiple whitespaces between the fields.

Eric cleared that up, I was wrong. Sorry for the noise.

>>> else
>>> lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[[
>>> ]]\($symcode$symcode*\)[[ ]][[
>>> ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
>>> fi
>>> @@ -5009,19 +5042,7 @@ m4_if([$1], [CXX], [
>>> _LT_TAGVAR(exclude_expsyms,
>>> $1)=['_GLOBAL_OFFSET_TABLE_|_GLOBAL__F[ID]_.*']
>>> case $host_os in
>>> aix[[4-9]]*)export_symbols_cmds
>>> - # If we're using GNU nm, then we don't want the "-C" option.
>>> - # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>> - # Without the "-l" option, or with the "-B" option, AIX nm treats
>>> - # weak defined symbols like other global defined symbols, whereas
>>> - # GNU nm marks them as "W".
>>> - # While the 'weak' keyword is ignored in the Export File, we need
>>> - # it in the Import File for the 'aix-soname' feature, so we have
>>> - # to replace the "-B" option with "-P" for AIX nm.
>>> - if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>> - _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience
>>> | awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2
>>> == "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 "
>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>> - else
>>> - _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e
>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if
>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 2
>>> == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != "."))
>>> { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 " weak"
>>> } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>> - fi
>>> + _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience |
>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = ""
>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u >
>>> $export_symbols'
>
> On a side note:
> As the C++ value is identical to the C one for various platforms,
> wouldn't it work for them to do something like:
> _LT_TAGVAR(export_symbols_cmds, $1)=$_LT_TAGVAR(export_symbols_cmds)
> _LT_TAGVAR(exclude_expsyms, $1)=$_LT_TAGVAR(exclude_expsyms)

Would you not need to change tag between the get and the set for that
to work as I think you intend? What am I missing?

>>> ;;
>>> pw32*)
>>> _LT_TAGVAR(export_symbols_cmds, $1)=$ltdll_cmds
>>> @@ -5464,19 +5485,7 @@ _LT_EOF
>>> exp_sym_flag='-Bexport'
>>> no_entry_flag=
>>> else
>>> - # If we're using GNU nm, then we don't want the "-C" option.
>>> - # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>> - # Without the "-l" option, or with the "-B" option, AIX nm treats
>>> - # weak defined symbols like other global defined symbols, whereas
>>> - # GNU nm marks them as "W".
>>> - # While the 'weak' keyword is ignored in the Export File, we need
>>> - # it in the Import File for the 'aix-soname' feature, so we have
>>> - # to replace the "-B" option with "-P" for AIX nm.
>>> - if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>> - _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience |
>>> awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 ==
>>> "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 "
>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>> - else
>>> - _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e
>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if
>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 2
>>> == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != "."))
>>> { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 " weak"
>>> } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>> - fi
>>> + _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience |
>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = ""
>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u >
>>> $export_symbols'
>
> The main motivation here is this simplification after all,
> as this needs another symbol exclusion (patch 4/4), which
> does make sense for the preloaded symbols list as well.

Yes, it is much cleaner to adjust the symbol pipe according to $NM,
than trying to "fix" $NM by adding flags. That part is nice indeed!

Cheers,
Peter
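
P.S. To make the last step of the simplified export_symbols_cmds
concrete, here is a minimal sketch of the awk stage on its own, outside
of libtool. It assumes the symbol pipe emits lines of the form
"<code> <prefixed-symbol> <symbol>", as the symxfrm in the patch
suggests. The function name export_list and the sample symbols are
made up for illustration, and the $EGREP exclusion step is left out.

```shell
#!/bin/sh
# Hypothetical helper, not part of libtool: read symbol-pipe output
# ("<code> <prefixed-symbol> <symbol>") on stdin and print one
# Export File line per symbol, tagging V, W and Z symbols as weak.
export_list() {
  awk '{ kw = "" } /^[VWZ] / { kw = " weak" } { print $3 kw }' |
    LC_ALL=C sort -u
}

# Invented sample input: two text symbols (one duplicated), one data
# symbol and two weak symbols.
printf '%s\n' \
  'T foo foo' \
  'W bar bar' \
  'Z qux qux' \
  'D baz baz' \
  'T foo foo' | export_list
# prints:
#   bar weak
#   baz
#   foo
#   qux weak
```

Since the weak marker is derived from the symbol code alone, this stage
does not care which nm produced the input, as long as the symbol pipe
has normalized it first.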