#1 is no big deal; we could just allocate one in a global class somewhere.
#2 actually seems quite desirable; is there any reason you don't want that?
#3 seems like a win for performance, since no locks have to be acquired to manage the collection of threads.
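For #1, a minimal sketch of "allocate one in a global class somewhere" might be a lazily-constructed, process-wide pool like the following. The accessor name is made up; it assumes the async()/wait() interface of llvm::ThreadPool from llvm/Support/ThreadPool.h.

  #include "llvm/Support/ThreadPool.h"

  // Illustrative only: one process-wide llvm::ThreadPool, created on first use.
  static llvm::ThreadPool &GetGlobalThreadPool() {
    static llvm::ThreadPool pool; // defaults to hardware_concurrency() threads
    return pool;
  }

  void Example() {
    auto future = GetGlobalThreadPool().async([] { /* do some work */ });
    future.wait();
  }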
On Sun, Apr 30, 2017 at 9:41 PM Scott Smith <scott.sm...@purestorage.com> wrote:

> The overall concept is similar; it comes down to implementation details like:
> 1. llvm doesn't have a global pool; it's probably instantiated on demand.
> 2. llvm keeps threads around until the pool is destroyed, rather than letting the threads exit when they have nothing to do.
> 3. llvm starts up all the threads immediately, rather than on demand.
>
> Overall I like the current lldb version better than the llvm version, but I haven't examined any of the use cases of the llvm version to know whether it could be dropped in without issue. However, neither does what I want, so I'll move forward prototyping what I think it should do, and then see how applicable it is to llvm.
>
> On Sun, Apr 30, 2017 at 9:02 PM, Zachary Turner <ztur...@google.com> wrote:
>
>> Have we examined llvm::ThreadPool to see if it can work for our needs? And if not, what kind of changes would be needed to llvm::ThreadPool to make it suitable?
>>
>> On Fri, Apr 28, 2017 at 8:04 AM Scott Smith via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>>
>>> Hmmm, ok, I don't like hard coding pools. Your idea about limiting the number of high level threads gave me an idea:
>>>
>>> 1. The system has one high level TaskPool.
>>> 2. TaskPools have up to one child and one parent (the parent for the high level TaskPool = nullptr).
>>> 3. When a worker starts up for a given TaskPool, it ensures a single child exists.
>>> 4. There is a thread local variable that indicates which TaskPool that thread enqueues into (via AddTask). If that variable is nullptr, then it is the high level TaskPool. Threads that are not workers enqueue into this TaskPool. If the thread is a worker thread, then the variable points to the worker's child.
>>> 5. When creating a thread in a TaskPool, its thread count AND the thread counts of the parent, grandparent, etc. are incremented.
>>> 6. In the main worker loop, if there is no more work to do, OR the thread count is too high, the worker "promotes" itself. Promotion means:
>>>    a. decrement the thread count for the current task pool
>>>    b. if there is no parent, exit; otherwise, become a worker for the parent task pool (and update the thread local TaskPool enqueue pointer).
>>>
>>> The main points are:
>>> 1. We don't hard code the number of task pools; the code automatically uses the fewest number of task pools needed, regardless of the number of places in the code that want task pools.
>>> 2. When the child task pools are busy, parent task pools reduce their number of workers over time to reduce oversubscription.
>>>
>>> You can fiddle with the number of allowed threads per level; for example, if you take into account the height of the pool and the number of child threads, you could allocate each level half as many threads as the level below it, unless the level below isn't using all of its threads; then the steady state would be 2 * cores rather than height * cores. I think that is probably overkill, though.
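For concreteness, the promotion scheme described in points 4-6 above might look roughly like the sketch below. This is only an illustration: none of these names exist in lldb's TaskPool, and locking, child creation, and task enqueueing are simplified.

  // Hypothetical sketch of the hierarchical pool / worker "promotion" idea.
  #include <atomic>
  #include <chrono>
  #include <condition_variable>
  #include <deque>
  #include <functional>
  #include <mutex>

  struct HierPool {
    HierPool *parent = nullptr;
    HierPool *child = nullptr;               // point 2: at most one child
    std::atomic<unsigned> thread_count{0};   // point 5: this pool plus descendants
    unsigned max_threads = 4;                // per-level budget, tunable
    std::deque<std::function<void()>> tasks;
    std::mutex mutex;
    std::condition_variable cv;

    // Wait briefly for a task; return false if the queue stayed empty.
    bool Pop(std::function<void()> &task) {
      std::unique_lock<std::mutex> lock(mutex);
      if (!cv.wait_for(lock, std::chrono::milliseconds(100),
                       [this] { return !tasks.empty(); }))
        return false;
      task = std::move(tasks.front());
      tasks.pop_front();
      return true;
    }
  };

  // Point 4: the pool that AddTask() on this thread would enqueue into.
  thread_local HierPool *g_enqueue_pool = nullptr;

  void WorkerLoop(HierPool *pool) {
    while (pool) {
      g_enqueue_pool = pool->child;          // workers enqueue one level down
      std::function<void()> task;
      if (pool->thread_count <= pool->max_threads && pool->Pop(task)) {
        task();
        continue;
      }
      // Point 6: no work, or too many threads -> "promote" to the parent.
      pool->thread_count--;                  // 6a
      pool = pool->parent;                   // 6b: nullptr means the thread exits
    }
  }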
>>> On Fri, Apr 28, 2017 at 4:37 AM, Pavel Labath <lab...@google.com> wrote:
>>>
>>>> On 27 April 2017 at 00:12, Scott Smith via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>>>> > After dealing with a bunch of microoptimizations, I'm back to parallelizing the loading of shared modules. My naive approach was to just create a new thread per shared library. I have a feeling some users may not like that; I think I read an email from someone who has thousands of shared libraries. That's a lot of threads :-)
>>>> >
>>>> > The problem is that loading a shared library can cause downstream parallelization through TaskPool. I can't then also have the loading of a shared library itself go through TaskPool, as that could cause a deadlock: if all the worker threads are waiting on work that TaskPool needs to run on a worker thread, then nothing will happen.
>>>> >
>>>> > Three possible solutions:
>>>> >
>>>> > 1. Remove the notion of a single global TaskPool, and instead have a static pool at each callsite that wants it. That way multiple paths into the same code would share the same pool, but different places in the code would have their own pool.
>>>>
>>>> I looked at this option in the past, and this was my preferred solution. My suggestion would be to have two task pools: one for low-level parallelism, which spawns std::thread::hardware_concurrency() threads, and another one for higher level tasks, which can only spawn a smaller number of threads (the algorithm for the exact number TBD). The high-level threads can access the low-level ones, but not the other way around, which guarantees progress.
>>>>
>>>> I propose to hardcode 2 pools, as I don't want to make it easy for people to create additional ones -- I think we should be having this discussion every time someone tries to add one, and have a very good justification for it (FWIW, I think your justification is good in this case, and I am grateful that you are pursuing this).
>>>>
>>>> pl
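The two-pool suggestion quoted above could be sketched as follows (all names invented, pool internals omitted; not lldb's actual TaskPool API). The key property is the one-way dependency: high-level tasks may hand work to the low-level pool and wait for it, but low-level tasks never wait on the high-level pool, which is what guarantees forward progress and avoids the deadlock described in the quoted message.

  #include <algorithm>
  #include <functional>
  #include <thread>

  class FixedTaskPool {
  public:
    explicit FixedTaskPool(unsigned num_threads) : num_threads_(num_threads) {}
    void AddTask(std::function<void()> task); // queue work for the worker threads
    void Wait();                              // block until the queue drains
  private:
    unsigned num_threads_;
    // ...worker threads, queue, mutex, condition variable...
  };

  // High-level pool: a small number of coarse tasks (e.g. "load one shared
  // module"). Exact sizing policy TBD; half the cores is just a placeholder.
  FixedTaskPool &GetHighLevelPool() {
    static FixedTaskPool pool(std::max(1u, std::thread::hardware_concurrency() / 2));
    return pool;
  }

  // Low-level pool: full hardware concurrency for the fine-grained work that
  // the coarse tasks fan out into.
  FixedTaskPool &GetLowLevelPool() {
    static FixedTaskPool pool(std::thread::hardware_concurrency());
    return pool;
  }

  // Allowed usage pattern: a high-level task enqueues into the low-level pool.
  //   GetHighLevelPool().AddTask([] {
  //     GetLowLevelPool().AddTask([] { /* parse symbols, etc. */ });
  //   });
  // The reverse (a low-level task waiting on the high-level pool) is forbidden.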