labath added a comment.

In D75711#1912902 <https://reviews.llvm.org/D75711#1912902>, @jingham wrote:

> In D75711#1912230 <https://reviews.llvm.org/D75711#1912230>, @labath wrote:
>
> > So this is technically not "our" fault. The plugin is doing something 
> > unsupported and gets weird behavior as a result. Now, I am not saying we 
> > should throw this plugin (maybe the only actually used os plugin) 
> > overboard. However, I was hoping that the process of supporting it would 
> > involve some improvements to the os plugin interface (so that it can do 
> > what it wants in a supported way), rather than declaring support for 
> > whatever it is the plugin is doing, even though we know it to be 
> > problematic (like, not being able to tell when we can throw thread plans 
> > away).
>
>
> I think when presenting an OS's threads through the funnel of a monitor stub 
> that only knows about cores and not about the OS's concept of threads, you 
> will very likely run into the problem where reconstructing all the threads at 
> every stop is expensive and not particularly desirable.  Two of the three 
> instances of OS plugins that I know about do it this way.  The other is for 
> wrapping a cooperative multi-threading library, where fetching all the 
> threads was pretty easy and there weren't too many.  But, for instance, if 
> getting all the threads really is slow, then for the OS plugin to be useful 
> we need to allow it to do partial reporting.
>
> Anyway, I don't know that we need anything more added to the OS interface.  
> There's a "make a thread for a TID" interface, which we can use to reap the 
> persistent part of the thread, for instance when we make a public stop.  If 
> that's fast enough (I haven't gotten to that point with the xnu plugin) we 
> can use that.  We can add an API to query whether the plugin returns all 
> threads, and gate our behavior on that rather than on "Has OS Plugin".  But 
> that's about all I can think of.


Yes, that's sort of what I was expecting. Another idea I had was to have a way 
to avoid reading the register context initially, and only fetch it on demand.
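
To illustrate what I mean (this is just a sketch with made-up names and 
stand-in types, not the real lldb interfaces): the thread object created for 
an OS-plugin thread could hold just the TID at first, and only ask the plugin 
for registers when something actually needs them.

```cpp
#include <cstdint>
#include <memory>

// Stand-ins for the real lldb types -- purely illustrative.
using tid_t = uint64_t;
struct RegisterContext { /* register values for one thread */ };
using RegisterContextSP = std::shared_ptr<RegisterContext>;

// Hypothetical helper: the real thing would ask the OS plugin (or the stub)
// for the registers of a single thread.
RegisterContextSP FetchRegisterContextFor(tid_t tid) {
  return std::make_shared<RegisterContext>();
}

// A thread that defers the (potentially expensive) register fetch until
// something actually asks for it.
class LazyRegisterThread {
public:
  explicit LazyRegisterThread(tid_t tid) : m_tid(tid) {}

  RegisterContextSP GetRegisterContext() {
    if (!m_reg_ctx)
      m_reg_ctx = FetchRegisterContextFor(m_tid);
    return m_reg_ctx;
  }

private:
  tid_t m_tid;
  RegisterContextSP m_reg_ctx; // empty until first requested
};
```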

> I also have to support all the OS plugins in all the xnu dSYM's that are 
> currently in the wild, too.  So whatever we do has to work without requiring 
> added behaviors.

Yep, anything we do shouldn't break existing plugins, but I think it's fair to 
require some modification to an existing plugin in order to enable a new 
feature (such as stepping across thread reschedules).
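
For example (a hypothetical sketch, not an actual patch), the query you 
mention could default to the current behavior, so the plugins already in the 
wild keep working and only updated plugins opt into the new feature:

```cpp
// Hypothetical shape of the opt-in query: existing plugins inherit the
// default and behave as they do today; a plugin that overrides this to
// return false tells lldb it only reports the "on core" threads.
class OperatingSystemInterfaceSketch {
public:
  virtual ~OperatingSystemInterfaceSketch() = default;

  // Default preserves today's behavior for plugins shipped in existing dSYMs.
  virtual bool DoesPluginReportAllThreads() { return true; }
};
```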

> 
> 
>> That said, I don't really use or care about the os plugins, so if using it 
>> results in leaking thread plans, then whatever. The thing I want to 
>> understand is what kind of requirements does this place on the rest of lldb. 
>> Which leads me to my next comment..
>> 
>>> but I think it fits the kernel folks workflow, since the actual number of 
>>> threads in the kernel is overwhelming and not terribly useful.  But you are 
>>> right, if we wanted to delay getting threads on private stops, but make 
>>> UpdateThreadListIfNeeded force a fetch, then we would have to add some API 
>>> to allow us to fetch only the "on core" threads along the code path that 
>>> handles internal stops.
>>> 
>>> In any case, in order to support having internal stops not fetch all 
>>> threads - which we are currently doing with OS Plugins - we need to be able 
>>> to persist the stateful data lldb holds for that thread across stops where 
>>> we don't realize the Thread object.  That is the sole ambition of this set 
>>> of patches.
>> 
>> We've agreed that we would like to avoid gathering the thread data for 
>> internal stops that don't need it. Avoiding creating (updating) 
>> lldb_private::Thread objects, and having the not-(yet)-realized threads live 
>> as a tid_t is one way to achieve this goal, but I am not sure if it is the 
>> right one. I am not really against the idea -- since threads can come and go 
>> between stops pretty much arbitrarily (even without os plugins), having 
>> everything hold onto them as simple handles, and forcing a fetch through 
>> some Process API when one needs to access them, may simplify some things 
>> (for one, the lifetime of Threads could become much stricter).
>> 
>> I can certainly believe that changing thread plans to use a tid_t instead of 
>> a Thread& is simpler than making Thread object management lazy, but I am not 
>> sure that is a fair comparison. Thread plans are only a single component 
>> which juggles Thread objects. There are many other pieces of code, which 
>> assume that the lifetime of a Thread object matches that of the entity it 
>> describes. SBThread is one such thing. If SBThread keeps referring to the 
>> actual thread as a pointer, then it will lose track of it when the actual 
>> thread goes away and reappears. Same goes for ValueObjects and SBValues -- 
>> if lldb_private::Thread is meant to be transient, then those should also be 
>> updated to use a tid_t(*), but then the changes required to do that become 
>> much larger, and it's not clear to me whether that is really worth it, since 
>> this is just kind of reinventing `weak_ptr<Thread>` (but with the downside 
>> mentioned in (*)).
> 
> Everything that holds onto its state using an ExecutionContextRef already 
> holds onto the m_tid as the real entity, and the Thread is only a cache (see 
> ExecutionContextRef::GetThreadSP).  That probably hasn't been pushed hard, 
> and I've only written a few tests with actually vanishing threads so far.  
> But if it doesn't work properly then that's a bug to fix not a design to 
> change.
> 
> ValueObjects (and therefore SBValues) use an ExecutionContextRef in their 
> UpdatePoint, they don't hold onto a thread.  SBThread's also hold 
> ExecutionContextRef's not ThreadWP's to get back to their actual thread.  In 
> general, anything that wants to know "where am I in the state space of the 
> process" is supposed to use an ExecutionContextRef to store its state not 
> individual target/process/thread/etc.  So I'm not sure this is so far off, 
> but we would have to write some more tests with threads coming and going to 
> make sure this is used consistently.

This sounds very convincing to me. I was under the impression that this is kind 
of inventing something new, but since there already is a notion of floating 
Thread objects, I don't have a problem with that. I'd say the existing 
ExecutionContextRef mechanism works fairly well, given that I hadn't even 
noticed that it re-fetches threads.
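
To make sure I'm reading it right (a minimal sketch of my understanding, with 
simplified stand-in types rather than the real lldb_private classes): the TID 
is the real identity, and the Thread object is only a cache that gets 
re-fetched from the process when it goes stale.

```cpp
#include <cstdint>
#include <memory>

// Simplified stand-ins -- the real classes live in lldb_private.
using tid_t = uint64_t;
struct Thread { tid_t tid = 0; };
using ThreadSP = std::shared_ptr<Thread>;
using ThreadWP = std::weak_ptr<Thread>;

struct Process {
  // Stand-in: the real version consults the current thread list, which may
  // need to be (re)populated for this TID.
  ThreadSP FindThreadByID(tid_t tid) {
    auto thread_sp = std::make_shared<Thread>();
    thread_sp->tid = tid;
    return thread_sp;
  }
};

// The gist of how I understand ExecutionContextRef::GetThreadSP to work.
class ExecutionContextRefSketch {
public:
  ExecutionContextRefSketch(Process &process, tid_t tid)
      : m_process(process), m_tid(tid) {}

  ThreadSP GetThreadSP() {
    ThreadSP thread_sp = m_thread_wp.lock();
    if (!thread_sp) {
      // The old Thread object is gone (e.g. it was not realized at this
      // stop); re-fetch it from the process by TID.
      thread_sp = m_process.FindThreadByID(m_tid);
      m_thread_wp = thread_sp;
    }
    return thread_sp;
  }

private:
  Process &m_process;
  tid_t m_tid;          // the real identity
  ThreadWP m_thread_wp; // cache only
};
```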

> 
> 
>> (*) I don't believe tid_t is actually a good way to keep track of the 
>> identity of a thread across time. It does uniquely identify a thread at any 
>> given point in time, but thread ids can be recycled (sometimes very quickly 
>> -- running the lldb test suite just once causes linux to wrap its pid 
>> space), which means you can't guarantee it will always refer to the same 
>> physical thread. If we're going to be using numbers to identify threads I 
>> think it should be some other kind of an identifier -- but that is sort of 
>> what a `Thread*` already is.
> 
> Not sure I understand this.  If I run the process and a real thread exits and 
> another one is created with the same TID, all before stopping, when I go to 
> merge the thread lists on stop I'm going to think this different thread was 
> the same thread because it had the same TID.  I don't see how using Thread 
> pointers rather than TID's helps with this.  I still have to trust the TID I 
> got from the monitor.  We are always going to have a problem merging up 
> Threads when TID's can be reused.  I don't see how our Thread *'s (which 
> don't represent anything in the target, and thus don't help with reasoning 
> about persistence) help this at all.

A Thread* helps precisely because it "doesn't represent anything in the 
target". Since tid_ts are supposed to match the OS's notion of a thread 
identifier, and these are not unique across time on all systems, we'd need 
to come up with some surrogate identifier to be able to tell them apart. A 
pointer is one kind of unique identifier.

However, you are right that this alone would not help us maintain accurate 
thread identity because the tid could come and go without us noticing (though 
it would help if we do observe the thread to disappear). Further changes would 
be needed to make that work (and it's not clear whether they are actually worth 
it). I was mainly trying to ensure that any new concept we come up with does 
not get in the way if that becomes necessary. But as we are not inventing a new 
concept here, this point is moot.
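
For completeness, this is the kind of surrogate identifier I had in mind (an 
illustrative sketch only): the debugger hands out its own monotonically 
increasing id whenever it sees a "new" thread, so a recycled TID doesn't get 
mistaken for the old thread -- though, as noted above, that only works when we 
actually notice the old thread going away.

```cpp
#include <cstdint>
#include <map>

using tid_t = uint64_t;

// Sketch of a debugger-side identity that survives TID reuse. A Thread*
// achieves something similar, as long as we notice the old thread exiting
// and destroy its object.
class ThreadIdentityTable {
public:
  // Call when a stop reports `tid`; `known_exited` is true if we observed
  // the previous thread with this TID go away.
  uint64_t GetOrAssignId(tid_t tid, bool known_exited) {
    auto it = m_ids.find(tid);
    if (it == m_ids.end() || known_exited)
      it = m_ids.insert_or_assign(tid, m_next_id++).first;
    return it->second;
  }

private:
  std::map<tid_t, uint64_t> m_ids;
  uint64_t m_next_id = 1;
};
```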


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75711/new/

https://reviews.llvm.org/D75711


