On 28/11/18 11:11 +0000, Jonathan Wakely wrote:
On 28/11/18 11:46 +0100, Richard Biener wrote:
On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwak...@redhat.com> wrote:
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!
It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).
I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.
It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).
There are a few other things we could do though:
1) As well as the __default_lock_policy value, we could add a
"library's preferred lock policy" value, which could be different from
__default_lock_policy. That policy would get used for the __shared_ptr
objects in the libstdc++ API/ABI, and would have to be consistent for
all CUs in the library and using its headers. That solves the problem
of user CUs being incompatible with the library because they used
-march. It doesn't change the fact that a libstdc++.so using "atomic"
is not compatible with one using "mutex". For this to work the library
always needs to use std::__shared_ptr<X, __library_lock_policy>
instead of std::shared_ptr<X> so that the policy is explicit (which I
think is a good rule anyway).
1a) Then (if we wanted) we could restore the old behaviour where
__default_lock_policy is chosen by the preprocessor and can vary
between user CUs. If users want to compile different CUs with
different lock policies, that's their choice. They risk creating
incompatible shared_ptr objects which cannot be passed between CUs,
but it becomes their responsibility to get it right. The library
always uses its preferred policy, and so won't get it wrong. This
seems easy for users to get wrong, but would arguably be more
backwards compatible (by still allowing users to get silent UB that
only fails at runtime).
1b) Or (if we wanted) we could still fix __default_lock_policy at
configure-time, but add a second version of the new option
--with-libstdcxx-lock-policy, so the default policy for user CUs and
the library's preferred policy can be chosen independently. You could
say you want the library to always use "mutex" but have two builds of
GCC, one where user CUs use "atomic" and one where they use "mutex".
The libstdc++.so used by those two GCC builds would be compatible, but
user CUs that use std::shared_ptr would not be.
Options 1a and 1b are mutually exclusive.
2) We could consider using __attribute__((__abi_tag__("..."))) on
std::shared_ptr when the lock policy chosen by the preprocessor
doesn't match what would have been chosen when libstdc++ was built. So
if a user compiles with a -march option that changes the lock
policy, we change the mangling of std::shared_ptr. This would still
allow incompatible std::shared_ptr objects, but they would cause
linker errors. For PR 42734 and PR 67843 that would have meant the
programs fail to link, instead of having runtime UB. The solution I
committed makes them link and run correctly, which seems better than
just saying "no, you can't do that".
3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because
Option 1) on its own seems to have no advantage over what I committed.
Option 1a) is still too easy for users to get wrong and get UB, only
now it's all their fault not my fault. That doesn't really help users.
Option 1b) makes it harder for users to get wrong (they have to mix
CUs built by different toolchains, not just built with different
-march options). I'm not sure there's much benefit to being able to
control the library policy separately from the user policy though.
Option 2) makes the failures happen earlier (when linking, not
runtime) but doesn't actually make anything work that failed
previously.
Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.