It sounds like you are describing two problems.

1) Idleness - Tasks are holding threads in the thread pool while they
wait for IO or some long running non-CPU task to complete.  These
threads are often in a "wait" state or something similar.
2) Fairness - The ordering of tasks is causing short tasks that could
be completed quickly from being stuck behind longer term tasks.
Fairness can be an issue even if all tasks are always in the active
state consuming CPU time.

Are both of these issues a problem?  Are you looking to address both of them?

I doubt it's much help as it is probably a more substantial change
than what you were looking for but the popular solution to #1 these
days seems to be moving toward non blocking IO with
promises/callbacks/async.  That way threads are never in the waiting
state (unless sitting idle in the pool).

-Weston

On Tue, Sep 15, 2020 at 7:00 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> In light of ARROW-9924, I wanted to rekindle the discussion about our
> approach to multithreading (especially the _programming model_) in
> C++. We had some discussions about this about 6 months ago and there
> were more discussions as I recall in summer 2019.
>
> Realistically, we are going to be consistently dealing with
> independent concurrent in-process workloads that each respectively can
> go faster by multithreading. These could be things like:
>
> * Reading file formats (CSV, Parquet, etc.) that benefit from
> multithreaded parsing/decoding
> * Reading one or more files in parallel using the Datasets API
> * Executing any number of multithreaded analytical workloads
>
> One obvious issue with our thread scheduling is the FIFO nature of the
> global thread pool. If a new independent multithreaded workload shows
> up, it has to wait for other workloads to complete before the new work
> will be scheduled. Think about a Flight server serving queries to
> users -- is it fair for one query to "hog" the thread pool and force
> other requests to wait until they can get access to some CPU
> resources? You could imagine a workload that spawns 10 minutes worth
> of CPU work, where a new workload has to wait for all of that work to
> complete before having any tasks scheduled for execution.
>
> The approach that's been taken in the Datasets API to avoid problems
> with nested parallelism (file-specific operations spawning multiple
> tasks onto the global thread pool) is simply to disable multithreading
> at the level of a single file. This is clearly suboptimal.
>
> We have additional problems in that some file-loading related tasks do
> a mixture of CPU work and IO work, and once a thread has been
> dispatched to execute one of these tasks, when IO takes place, a CPU
> core may sit underutilized while the IO is waiting.
>
> There's more aspects we can discuss, but in general I think we need to
> come up with a programming model for building our C++ system
> components with the following requirements:
>
> * Deadlocks not possible by design
> * Any component can safely use "nested parallelism" without the
> programmer having to worry about deadlocks or one task "hogging" the
> thread pool. So in other words, if there's only a single
> multithreading-capable workload running, we "let it rip"
> * Resources can be reasonably fairly allocated amongst concurrent
> workloads (think: independent requests coming in through Flight, or
> scan tasks on different Parquet files in the Datasets API). Limit
> scenarios where a new workload is blocked altogether on the completion
> of other workloads
> * A well-defined programming pattern for tasks that do a mixture of
> CPU work and IO work that allows CPU cores to be used when a task is
> waiting on IO
>
> We can't be the only project that has these problems, so I'm
> interested to see what solutions have been successfully employed by
> others. For example, it strikes me as similar to concurrency issues
> inside an analytic database. How are they preventing concurrent
> workload starvation problems or handling CPU/IO task scheduling to
> avoid CPU underutilization?
>
> Choices of which threading libraries we might use to implement a
> viable solution (e.g. TBB) seem secondary to the programming model
> that we use to implement our components.
>
> Thanks,
> Wes

Reply via email to