On Wed, 2013-05-29 at 10:06 -0400, Ryan Johnson wrote:
> On 29/05/2013 9:41 AM, Ian Lance Taylor wrote:
> > On Tue, May 28, 2013 at 9:02 PM, Ryan Johnson
> > <ryan.john...@cs.utoronto.ca> wrote:
> >> Maybe I misunderstood... there's currently a (very small) cache
> >> (unwind-dw2-fde-dip.c) that lives behind the loader mutex. It contains 8
> >> entries and each entry holds the start and end addresses for one loaded
> >> object, along with a pointer to the eh_frame_header. The cache resides in
> >> static storage, and so accessing it is always safe.
> >>
> >> I think what you're saying is that the p_eh_frame_hdr field could end up
> >> with a dangling pointer due to a dlclose call?
> > Yes, that can happen.
> >
> >> If so, my argument is that, as long as the cache is up to date as of the
> >> start of unwind, any attempt to access a dangling p_eh_frame_hdr means that
> >> in-use code was dlclosed, in which case unwind is guaranteed to fail
> >> anyway.
> >> The failure would just have different symptoms with such a cache in place.
> >>
> >> Am I missing something?
> > I think you're right about that. But what happens if the entry is not
> > in the cache? Or, do you mean you want to look in the cache before
> > calling dl_iterate_phdr? That should be safe but of course you still
> > need a lock as multiple threads can be manipulating the cache at the
> > same time.
> Per-thread cache, either allocated and populated at the start of every
> unwind, or maintained in TLS with version checks against the dl
> adds/subs counts. The former reduces the lock grabbing to one per
> unwind; the latter would virtually eliminate locking during unwind, but
> would require changes to libc to expose the adds/subs counts in some
> lock-free way.

I haven't looked at the actual use case, but if we're really mostly
interested in an atomic snapshot of a simple data structure, then there
are various ways to do that more efficiently than via holding a lock
while doing the snapshot.

In particular, if the cache is read-mostly and always mapped, there's a
good chance one can build something more efficient without this becoming
too complex. The version checks Ryan mentions are one option.

If there's a need, I can help with the synchronization bits.

Torvald
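
For illustration only, here is a minimal sketch of what a version-checked
atomic snapshot could look like, written as a seqlock with C11 atomics. It
is not the libgcc or glibc code. The names (struct entry, struct snapshot,
version, publish, read_snapshot) are hypothetical, the sketch covers a
single entry rather than the 8-entry cache, and it assumes writers are
already serialized by some existing lock such as the loader mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared cache entry; field names are illustrative only. */
struct entry {
    _Atomic uintptr_t pc_low;
    _Atomic uintptr_t pc_high;
    _Atomic uintptr_t eh_frame_hdr;
};

/* Plain copy handed back to the reader. */
struct snapshot {
    uintptr_t pc_low;
    uintptr_t pc_high;
    uintptr_t eh_frame_hdr;
};

/* Even value: entry is stable. Odd value: a writer is mid-update. */
static atomic_uint version;
static struct entry shared;

/* Writer side. Assumption: callers are serialized externally,
   so only one thread ever runs publish() at a time. */
static void publish(uintptr_t lo, uintptr_t hi, uintptr_t hdr)
{
    assert(lo <= hi);
    unsigned v = atomic_load_explicit(&version, memory_order_relaxed);

    /* Mark the entry as being modified (version becomes odd). */
    atomic_store_explicit(&version, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&shared.pc_low, lo, memory_order_relaxed);
    atomic_store_explicit(&shared.pc_high, hi, memory_order_relaxed);
    atomic_store_explicit(&shared.eh_frame_hdr, hdr, memory_order_relaxed);

    /* Publish the new contents (version becomes even again). */
    atomic_store_explicit(&version, v + 2, memory_order_release);
}

/* Reader side. Takes no lock; retries if a writer interfered. */
static struct snapshot read_snapshot(void)
{
    struct snapshot s;
    unsigned v0, v1;

    do {
        v0 = atomic_load_explicit(&version, memory_order_acquire);

        s.pc_low = atomic_load_explicit(&shared.pc_low,
                                        memory_order_relaxed);
        s.pc_high = atomic_load_explicit(&shared.pc_high,
                                         memory_order_relaxed);
        s.eh_frame_hdr = atomic_load_explicit(&shared.eh_frame_hdr,
                                              memory_order_relaxed);

        /* Order the data loads before the second version load. */
        atomic_thread_fence(memory_order_acquire);
        v1 = atomic_load_explicit(&version, memory_order_relaxed);
    } while ((v0 & 1u) || v0 != v1);

    return s;
}
```

A reader that sees the same even version before and after copying the
fields knows the copy is consistent; otherwise it simply retries. This only
works if the shared entry stays mapped for the whole read, which matches
the "always mapped" condition above. A per-thread cache kept in TLS could
use the same counter to decide whether its private copy is stale.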