On Wed, Jun 5, 2019 at 10:42 PM 김규래 <msc...@naver.com> wrote:

> > On Wed, Jun 5, 2019 at 9:25 PM 김규래 <msc...@naver.com> wrote:
>
> >
> > > Hi, thanks for the detailed explanation.
> > > I think I now get the picture.
> > > Judging from my current understanding, task parallelism currently
> > > works as follows:
> > > 1. Tasks are placed in a global shared queue.
> > > 2. Workers consume the tasks by repeatedly popping from the queue in
> > > a while loop, just as in self-scheduling (dynamic scheduling).
> > >
> > > Then the improvements, including work-stealing, would be:
> > > 1. Each worker holds a dedicated task queue, reducing resource
> > > contention.
> > > 2. The tasks are distributed in a round-robin fashion.
>
> > For nested task submission (does OpenMP support that?) you probably
> > want to submit to the local queue rather than round-robin, no?
>
>
>
> Hi Janne,
>
> I believe you're talking about spawning child tasks from within an already
> running task, which OpenMP does support.
>
> In this case, pushing to the local queue could introduce a deadlock if the
> master thread waits on the spawned tasks.
>
> Only a brief one, though, since work-stealing can resolve the deadlock.
>
> A better way to handle this is to round-robin the child tasks across the
> queues, excluding the queue of the thread running the current task.
>
> Then waiting on the tasks should never cause a deadlock.
>
> Or did I misunderstand your question?
>
> Maybe there is a specific reason for avoiding round-robin that I
> missed?
>
Better cache locality.
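
Just to make sure we're talking about the same thing, here's roughly what
the nested case looks like in plain OpenMP (a minimal standalone sketch,
not anything taken from the libgomp internals):

/* A task that spawns child tasks and then waits for them with taskwait.
   Compile with e.g. "gcc -fopenmp nested.c".  */
#include <stdio.h>
#include <omp.h>

int main (void)
{
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task                      /* parent task */
    {
      int sum = 0;
      for (int i = 0; i < 4; i++)
        {
          /* Child tasks spawned from within an already running task.  */
          #pragma omp task shared(sum)
          {
            #pragma omp atomic
            sum += i;
          }
        }
      /* The parent blocks here until all of its children have finished.  */
      #pragma omp taskwait
      printf ("sum = %d (ran on thread %d)\n", sum, omp_get_thread_num ());
    }
  }
  return 0;
}

If the children are pushed onto the parent's own queue, the data they touch
is likely still warm in that thread's cache, but the thread sitting in
taskwait then has to either run them itself or rely on another worker
stealing them, as you say.  Round-robining them to the other queues avoids
that wait, at the cost of locality.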

Another option, which I guess starts to go beyond the scope of your GSoC, is
parallel depth-first (PDF) scheduling (Blelloch 1999) as an alternative to
work stealing. Here's a presentation about some recent work in this area,
although for Julia and not OpenMP (no idea whether PDF would fit OpenMP at
all): https://www.youtube.com/watch?v=YdiZa0Y3F3c

-- 
Janne Blomqvist
