labath added a comment.

Thanks for the detailed response. I'll try to be equally clear.

In D75711#1910155 <https://reviews.llvm.org/D75711#1910155>, @jingham wrote:

> In D75711#1909055 <https://reviews.llvm.org/D75711#1909055>, @labath wrote:
>
> > I am somewhat worried about the direction this is taking. Maybe vanishing 
> > threads are fine for os-level debugging (because there the "real" threads 
> > are the cpu cores, and the os threads are somewhat imaginary), but I 
> > definitely wouldn't want to do it for regular debugging. If my process has 
> > 1000 threads (not completely uncommon) then I don't think my debugger would 
> > be doing me a service showing me just the three it thinks are interesting 
> > and completely disregarding the others. Having a mode (it could even be 
> > default) which only highlights the interesting ones -- yes, that would 
> > definitely be useful. Making sure it doesn't touch the other threads when 
> > it doesn't need to -- absolutely. But I don't think it should pretend the 
> > other threads don't exist at all, because I may still have a reason to 
> > inspect those threads (or just to know that they're there). In fact, I 
> > wouldn't be surprised if this happened to the said kernel folks too, and 
> > they end up adding a custom command, which would enumerate all "threads".
>
>
> First off, I also think that hiding some threads from users, or forcing 
> people to say "thread list --force" or something like that to see them all 
> would not be a user-friendly design.  I'm not suggesting imposing that on 
> general debugging.  But if an OS plugin, for instance, decides it is too 
> expensive to do this except as an explicit gesture, we need to support that 
> decision.
>
> But lldb stops many times during the course of a step where it doesn't return 
> control to the user - all the step-into-code-with-no-debug-info, step-out-again, 
> step-across-dynamic-linker-stubs, or step-through-ObjC-message-dispatch dances 
> we do involve many private stops per public stop...  If a thread has not 
> stopped "for a reason" at one of those stops, it takes no part in the "should 
> stop" and "will resume" negotiations, and reconstructing it for that stop 
> really does serve no purpose.  There's no way that the user will get a chance 
> to see it.
>
> So if we could avoid reading them in, that might make stepping go faster in 
> the presence of lots of threads.  BTW, with the accelerated stop packets 
> where we get all the threads and their stop reasons on stop, I'm not sure 
> this would really save all that much time.  But talking to less capable 
> stubs, it really can make a difference.  Greg did a bunch of work to make 
> sure stepping stopped as few times as possible because the Facebook 
> gdb-remote stub didn't support these accelerated packets and getting all the 
> thread info on each private stop was making stepping go really slowly.  I 
> didn't understand the motivation at first because for debugserver the changes 
> didn't really make any difference.
>
> Also, I don't think these two desires - not fetching threads when we don't 
> need them (e.g. private stops) and showing them all to the user whenever they 
> ask - are contradictory.  After all, we still have the gate of 
> UpdateThreadListIfNeeded.  All the commands that touch the thread list go 
> through this. So even if we don't gather all the threads when we stop, we 
> already have a mechanism in place whereby any command that actually touches 
> the thread list can fetch them on demand.


Yes, I agree with everything above. If lldb does not need information about all 
threads in order to service an internal stop, then it shouldn't read it in, and 
the os plugin (or the remote stub) should not need to provide it. In fact, I 
would even go so far as to say that we shouldn't need to read in this 
information for a public stop until a user performs an operation (e.g. `thread 
list`) that requires information about all threads.
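
To make the "fetch on demand" part concrete, here is a minimal sketch of the 
kind of gating I have in mind. The names (HandleStopNotification, 
m_thread_list_fresh, the ThreadList stand-in) are illustrative rather than the 
actual lldb_private interfaces; the point is only that a private stop 
invalidates the cached list cheaply, and the expensive fetch happens inside the 
single UpdateThreadListIfNeeded-style choke point:

```cpp
#include <vector>

// Illustrative stand-ins for the real lldb_private types.
struct Thread { /* registers, stop reason, ... */ };
using ThreadList = std::vector<Thread>;

class Process {
public:
  // Called on every stop, public or private. Deliberately cheap: it only
  // marks the cached list stale instead of re-fetching it.
  void HandleStopNotification() { m_thread_list_fresh = false; }

  // The single choke point for anything that actually needs the threads
  // ("thread list", SBProcess::GetNumThreads, thread plan negotiation, ...).
  ThreadList &GetThreadList() {
    if (!m_thread_list_fresh) {
      UpdateThreadListIfNeeded(); // talks to the stub / os plugin
      m_thread_list_fresh = true;
    }
    return m_thread_list;
  }

private:
  void UpdateThreadListIfNeeded() { /* fetch thread data on demand */ }
  ThreadList m_thread_list;
  bool m_thread_list_fresh = false;
};
```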

> 
> 
>> What was actually the slow part here? Was it the os plugin reading the 
>> thread lists from the kernel memory? Or the fact that lldb is slow when it 
>> is confronted with them?
>> 
>> If it is the latter, then maybe we should fix that. For example, I think I 
>> remember lldb likes to check the PC of every thread before it does any kind 
>> of a resume -- I am not convinced that is really needed.
>> 
>> If fetching the threads is slow, then perhaps we could have an api, where 
>> the os threads are produced incrementally, and we ensure that lldb does not 
>> query for os threads needlessly during the hot paths (e.g. internal stops 
>> for servicing thread plans).
> 
> The thing that is slow is gathering all the threads.  On Darwin, there's a 
> kernel activation per user space thread and then a whole bunch of kernel 
> threads.  On a busy system, that ends up being a whole lot of threads.  
> Producing them through the OS plugin interface involves a bunch of memory 
> reads to fetch the thread data, and following pointers around to get the 
> register state for the threads, etc.
> 
> Note also, the way the OS Plugins currently work, when the process stops, the 
> OS Plugin just gets asked to produce the threads backing the "on core" 
> threads.  That's all UpdateThreadListIfNeeded will fetch, and there's no lldb 
> way to get the OS Plugin to fetch all the threads it can produce.  The only 
> way to do that with the Darwin kernel is to run other commands in the set of 
> Python commands provided by the kernel team that either fetch a non-current 
> thread by ID or force fetching all the threads.
> 
> So the current state when working with OS Plugins violates your principle 
> given above,

IIUC, this principle is only violated for this particular os plugin, which 
violates the plugin contract and does not return all threads through the 
`get_thread_info` interface. I don't know whether this contract is spelled out 
anywhere, but the code in lldb at least behaves as though it exists.
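
For what it's worth, my mental model of that contract is that whatever comes 
back from `get_thread_info` becomes lldb's entire notion of the thread list for 
that stop. A paraphrase of that behavior (not the actual OperatingSystem plugin 
code; the names and types are stand-ins):

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Illustrative stand-ins; not the real lldb_private classes.
struct Thread {
  uint64_t tid;
  explicit Thread(uint64_t t) : tid(t) {}
};
using ThreadSP = std::shared_ptr<Thread>;

// The new thread list is built from exactly the tids the plugin reported, so
// any thread omitted from get_thread_info simply ceases to exist as far as
// lldb is concerned -- which is what happens to the kernel threads that are
// not "on core" at a given stop.
std::vector<ThreadSP>
UpdateThreadsFromPlugin(const std::vector<uint64_t> &plugin_tids,
                        const std::vector<ThreadSP> &old_threads) {
  std::map<uint64_t, ThreadSP> old_by_tid;
  for (const ThreadSP &t : old_threads)
    old_by_tid[t->tid] = t;

  std::vector<ThreadSP> new_threads;
  for (uint64_t tid : plugin_tids) {
    auto it = old_by_tid.find(tid);
    // Reuse the existing object if the same tid was reported again,
    // otherwise realize a fresh Thread backed by the plugin's data.
    new_threads.push_back(it != old_by_tid.end()
                              ? it->second
                              : std::make_shared<Thread>(tid));
  }
  // Old threads that were not re-reported are dropped with old_threads.
  return new_threads;
}
```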

So this is technically not "our" fault. The plugin is doing something 
unsupported and gets weird behavior as a result. Now, I am not saying we should 
throw this plugin (maybe the only actually used os plugin) overboard. However, 
I was hoping that the process of supporting it would involve some improvements 
to the os plugin interface (so that it can do what it wants in a supported 
way), rather than declaring support for whatever it is the plugin is doing, 
even though we know it to be problematic (like, not being able to tell when we 
can throw thread plans away).

That said, I don't really use or care about the os plugins, so if using one 
results in leaking thread plans, then whatever. The thing I want to understand 
is what kind of requirements this places on the rest of lldb. Which leads me to 
my next comment...

> but I think it fits the kernel folks' workflow, since the actual number of 
> threads in the kernel is overwhelming and not terribly useful.  But you are 
> right, if we wanted to delay getting threads on private stops, but make 
> UpdateThreadListIfNeeded force a fetch, then we would have to add some API to 
> allow us to fetch only the "on core" threads along the code path that handles 
> internal stops.
> 
> In any case, in order to support having internal stops not fetch all threads 
> - which we are currently doing with OS Plugins - we need to be able to 
> persist the stateful data lldb holds for that thread across stops where we 
> don't realize the Thread object.  That is the sole ambition of this set of 
> patches.

We've agreed that we would like to avoid gathering the thread data for internal 
stops that don't need it. Avoiding creating (updating) lldb_private::Thread 
objects, and having the not-(yet)-realized threads live as a tid_t, is one way 
to achieve this goal, but I am not sure it is the right one. I am not really 
against the idea -- since threads can come and go between stops pretty much 
arbitrarily (even without os plugins), it may make sense to have everything 
hold onto them as simple handles and force a fetch through some Process api 
whenever one needs to access them; that may simplify some things (for one, the 
lifetime of Thread objects could become much stricter).
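
If we did go the handle route, I imagine the access pattern looking roughly 
like the following. This is purely a sketch under the assumption of some 
GetThreadByID-style lookup on Process; none of the names are meant to be the 
real API:

```cpp
#include <cstdint>
#include <memory>

// Illustrative stand-ins for lldb_private types.
struct Thread { /* registers, frames, ... */ };
using ThreadSP = std::shared_ptr<Thread>;
using tid_t = uint64_t;

struct Process {
  // Hypothetical lookup: realizes (or re-fetches) the thread on demand and
  // returns nullptr if no such thread exists at the current stop.
  ThreadSP GetThreadByID(tid_t tid) {
    (void)tid;
    return nullptr; // stand-in body; the real thing would query the stub/plugin
  }
};

// A component (thread plan, SBThread, ...) that holds only the handle.
class ThreadHandleUser {
public:
  ThreadHandleUser(Process &process, tid_t tid)
      : m_process(process), m_tid(tid) {}

  void DoSomethingWithThread() {
    // Every access goes through the Process; the Thread object itself is
    // allowed to disappear and be re-created between stops.
    if (ThreadSP thread_sp = m_process.GetThreadByID(m_tid)) {
      // ... use thread_sp ...
    } else {
      // The thread is gone (or not realized at this stop); degrade gracefully.
    }
  }

private:
  Process &m_process;
  tid_t m_tid;
};
```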

I can certainly believe that changing thread plans to use a tid_t instead of a 
Thread& is simpler than making Thread object management lazy, but I am not sure 
that is a fair comparison. Thread plans are only a single component that 
juggles Thread objects. There are many other pieces of code which assume that 
the lifetime of a Thread object matches that of the entity it describes. 
SBThread is one such thing. If SBThread keeps referring to the actual thread as 
a pointer, then it will lose track of it when the actual thread goes away and 
reappears. The same goes for ValueObjects and SBValues -- if 
lldb_private::Thread is meant to be transient, then those should also be 
updated to use a tid_t (*), but then the changes required to do that become 
much larger, and it's not clear to me whether that is really worth it, since 
this is just kind of reinventing `weak_ptr<Thread>` (but with the downside 
mentioned in (*)).
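
To spell out what I mean by reinventing `weak_ptr<Thread>`: if Thread objects 
lived exactly as long as the physical threads they describe, a holder like 
SBThread or a ValueObject could refer to them weakly and detect staleness for 
free. A sketch of the idea (not a claim about how SBThread is implemented 
today):

```cpp
#include <memory>

struct Thread { /* ... */ }; // illustrative stand-in

// A holder (SBThread, ValueObject, ...) that refers to the thread weakly:
// it never keeps the Thread alive, and it can tell when the thread is gone.
class ThreadRef {
public:
  explicit ThreadRef(const std::shared_ptr<Thread> &thread_sp)
      : m_thread_wp(thread_sp) {}

  // Returns the Thread if it still exists, nullptr otherwise. Unlike a raw
  // tid_t, a dead-and-recycled thread can never be confused with the original
  // one, because the shared_ptr control block is unique to that object.
  std::shared_ptr<Thread> Get() const { return m_thread_wp.lock(); }

private:
  std::weak_ptr<Thread> m_thread_wp;
};
```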

(*) I don't believe tid_t is actually a good way to keep track of the identity 
of a thread across time. It does uniquely identify a thread at any given point 
in time, but thread ids can be recycled (sometimes very quickly -- running the 
lldb test suite just once causes Linux to wrap its pid space), which means you 
can't guarantee it will always refer to the same physical thread. If we're 
going to be using numbers to identify threads, I think it should be some other 
kind of identifier -- but that is sort of what a `Thread*` already is.
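
If we do want a number rather than a pointer, one way to get a real identity 
would be to pair the OS tid with a debugger-generated counter that is never 
reused. Again, just a sketch of the idea, not something lldb currently has:

```cpp
#include <cstdint>

using tid_t = uint64_t;

// A debugger-side identity for "this particular incarnation of a thread":
// the OS tid for matching against the stub's reports, plus a monotonically
// increasing counter assigned the first time lldb sees that incarnation.
// Even if the OS recycles the tid, the counter never repeats, so two
// different physical threads can never compare equal.
struct ThreadIdentity {
  tid_t tid = 0;
  uint64_t debugger_id = 0; // assigned from a never-decreasing counter

  friend bool operator==(const ThreadIdentity &a, const ThreadIdentity &b) {
    return a.tid == b.tid && a.debugger_id == b.debugger_id;
  }
};

inline ThreadIdentity MakeThreadIdentity(tid_t tid) {
  static uint64_t g_next_id = 1; // process-wide counter; ids are never reused
  return ThreadIdentity{tid, g_next_id++};
}
```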


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75711/new/

https://reviews.llvm.org/D75711


