On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwak...@redhat.com> wrote: > > On 28/11/18 11:11 +0000, Jonathan Wakely wrote: > >On 28/11/18 11:46 +0100, Richard Biener wrote: > >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwak...@redhat.com> wrote: > >>> > >>>This resolves a longstanding issue where the lock policy for shared_ptr > >>>reference counting depends on compilation options when the header is > >>>included, so that different -march options can cause ABI changes. For > >>>example, objects compiled with -march=armv7 will use atomics to > >>>synchronize reference counts, and objects compiled with -march=armv5t > >>>will use a mutex. That means the shared_ptr control block will have a > >>>different layout in different objects, causing ODR violations and > >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as > >>>well as PR libstdc++/67843. > >>> > >>>The solution is to decide on the lock policy at build time, when > >>>libstdc++ is configured. The configure script checks for the > >>>availability of the necessary atomic built-ins for the target and fixes > >>>that choice permanently. Different -march flags used to compile user > >>>code will not cause changes to the lock policy. This results in an ABI > >>>change for certain compilations, but only where there was already an ABI > >>>incompatibility between the libstdc++.so library and objects built with > >>>an incompatible -march option. In general, this means a more stable ABI > >>>that isn't silently altered when -march flags make addition atomic ops > >>>available. > >>> > >>>To force a target to use "atomic" or "mutex" the new configure option > >>>--with-libstdcxx-lock-policy can be used. > >>> > >>>In order to turn ODR violations into linker errors, the uses of > >>>shared_ptr in filesystem directory iterators have been replaced > >>>with __shared_ptr, and explicit instantiations are declared. This > >>>ensures that object files using those types cannot link to libstdc++ > >>>libs unless they use the same lock policy. > >> > >>Would it be possible to have both in libstdc++ and with differnet mangling > >>of > >>different kind ensure compatibility between different user CUs? Or is > >>that too awkward for the user to get right? > > > >It would mean duplicating a lot more code, which is already duplicated > >once for the std::string ABI, so we'd have four permuations! > > > >It still wouldn't ensure compatibility between different user CUs, > >only between any user CU and any build of libstdc++. Different user > >CUs would still disagree on the ABI of the types, and so couldn't pass > >them between CUs. I see no advantage to supporting that for the > >std::filesystem library (unlike std::string and std::iostream, which > >are probably used in the majority of CUs). > > > >I do not want to get to the point where every type in libstdc++ exists > >multiple times and you select some combination via command-line flags. > >It's already becoming unmanageable with multiple std::string and long > >double ABIs. > > Also, in practice, I don't see a need. The common cases where this bug > arises are limited to 32-bit ARM, and for arm-linux the kernel helpers > mean that you can always use "atomic". For bare metal ARM toolchains > you probably don't want to be mixing CUs built against different > versions of libstdc++ anyway. You have your toolchain for the board, > and you use that. If it is configured to use "atomic", then that's > what you get. If it's configured to use "mutex", then that's what you > get. You probably don't want to use a toolchain configured for a > different board, but you could use --with-libstdcxx-lock-policy to > ensure a consistent policy across those toolchains if needed. > > It's also possible for problems to arise on i386. If you build GCC for > i386 then it will choose "mutex" and be incompatible with a libstdc++ > built for i486 or later. In that case, you could use the option > --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and > then you'd need to link to libatomic to provide the ops, as there are > no kernel helpers for i386 in libgcc).
I think the default should be part of the platform ABI somehow, like the i386 incompatibility should better not arise. I suppose libstdc++ configury doesn't read in gcc/config.gcc but does it have sth similar where it's easy to see defaults for plaforms? There's configure.host but I'm not sure this is related at all. Richard. > There are a few other things we could do though: > > 1) As well as the __default_lock_policy value, we could add a > "library's preferred lock policy" value, which could be different from > __default_lock_policy. That policy would get used for the __shared_ptr > objects in the libstdc++ API/ABI, and would have to be consistent for > all CUs in the library and using its headers. That solves the problem > of user CUs being incompatible with the library because they used > -march. It doesn't change the fact that a libstdc++.so using "atomic" > is not compatible with one using "mutex". For this to work the library > always needs to use std::__shared_ptr<X, __library_lock_policy> > instead of std::shared_ptr<X> so that the policy is explicit (which I > think is a good rule anyway). > > 1a) Then (if we wanted) we could restore the old behaviour where > __default_lock_policy is chosen by the preprocessor and can vary > between user CUs. If users want to compile different CUs with > different lock policies, that's their choice. They risk creating > incompatible shared_ptr objects which cannot be passed between CUs, > but it becomes their responsibility to get it right. The library > always uses its preferred policy, and so won't get it wrong. This > seems easy for users to get wrong, but would arguably be more > backwards compatible (by still allowing users to get silent UB that > only fails at runtime). > > 1b) Or (if we wanted) we could still fix __default_lock_policy at > configure-time, but add a second version of the new option > --with-libstdcxx-lock-policy, so the default policy for user CUs and > the library's preferred policy can be chosen independently. You could > say you want the library to always use "mutex" but have two builds of > GCC, one where user CUs use "atomic" and one where they use "mutex". > The libstdc++.so used by those two GCC builds would be compatible, but > user CUs that use std::shared_ptr would not be. > > Options 1a and 1b are mutually exclusive. > > 2) We could consider using __attribute__((__abi_tag__("..."))) on > std::shared_ptr when the lock policy chosen by the preprocessor > doesn't match what would have been chosen when libstdc++ was built. So > if a user compiles with a -march option that changes the lock > policy, we change the mangling of std::shared_ptr. This would still > allow incompatible std::shared_ptr objects, but they would cause > linker errors. For PR 42734 and PR 67843 that would have meant the > programs fail to link, instead of having runtime UB. The solution I > committed makes them link and run correctly, which seems better than > just saying "no, you can't do that". > > 3) We could change libstdc++'s configure to automatically override > --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic" > (because we know the kernel helpers are available). That causes an ABI > change for a GCC built for --target=armv5t-*-linux* so I didn't do > that by default. Packagers who want that can use the --with option > explicitly to build a libstdc++ with atomic refcounting that can be > used on any armv5t or later CPU, allowing users to choose a newer > -march for their own code. (Until my patch that didn't work, because > > > Option 1) on its own seems to have no advantage over what I committed. > > Option 1a) is still too easy for users to get wrong and get UB, only > now it's all their fault not my fault. That doesn't really help users. > > Option 1b) makes it harder for users to get wrong (they have to mix > CUs built by different toolchains, not just built with different > -march options). I'm not sure there's much benefit to being able to > control the library policy separately from the user policy though. > > Option 2) makes the failures happen earlier (when linking, not > runtime) but doesn't actually make anything work that failed > previously. > > Option 3) is not my call to make. If the ARM maintainers want to > change the default older arm-linux targets I have no objections. > >