Hi Weston,

Thanks for putting this comprehensive and informative document together.

There are several layers of problems to consider here. Just thinking
out loud:

* I hypothesize that the bottom of the stack is a thread pool with a
queue-per-thread that implements work stealing. Some code paths might
use this low-level task API directly; for example, a workload might
put all of its tasks into one particular queue and let the other
threads take work if they are idle.

* I've brought this up in the past, but if we are comfortable with
more threads than CPU cores, we could allow the base-level thread
pool to be expanded dynamically. The tradeoff here is coarse-grained
context switching between tasks only at task completion vs. the OS
context-switching mid-task between threads. For example, if a code
path wishes to guarantee that a thread is put to work right away to
execute its tasks, even if all of the other queues are full, then
this could partially address the task prioritization problem
discussed in the document. If there is a notion of a "task producer"
or a "workload", and the number of task producers exceeds the size of
the thread pool, then an additional thread + dedicated task queue
could be created to handle the tasks submitted by that producer.
Maybe this is a bad idea (I'm not an expert in this domain, after
all); let me know if it doesn't make sense.

* I agree that we should encourage as much code as possible to use
the asynchronous model. Per the above, if there is a mechanism for
async task producers to coexist alongside code that manually manages
the execution order of tasks generated by its task graph (I'm
thinking of query engine code here, a la Quickstep), that would be
good.
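
For anyone unfamiliar with the async model being proposed: the core
idea is that an I/O producer returns a future immediately rather than
blocking a CPU worker while the read is in flight. A generic sketch
(std::future standing in for Arrow's futures API; the helper name
ReadBlockAsync is made up):

```cpp
#include <cassert>
#include <future>

// Sketch only: an asynchronous producer returns a future right away.
// std::launch::async stands in for handing the read to an I/O pool.
std::future<int> ReadBlockAsync(int block_id) {
  return std::async(std::launch::async, [block_id] {
    // ...the blocking read would happen here, off the CPU workers...
    return block_id * 10;  // hypothetical "bytes read"
  });
}
```

A caller can keep doing CPU work between creating the future and
calling get() on it, which is exactly what a blocking read on a CPU
worker thread prevents.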

Lots to do here, but I'm excited to see things evolve and to see the
project grow faster and more scalable on systems with many cores
doing a lot of mixed IO/CPU work!

- Wes

On Tue, Feb 2, 2021 at 9:02 PM Weston Pace <weston.p...@gmail.com> wrote:
>
> This is a follow up to a discussion from last September [3].  I've
> been investigating Arrow's use of threading and I/O and I believe
> there are some improvements that could be made.  Arrow is currently
> supporting two threading options (single thread and "per-core" thread
> pool).  Both of these approaches are hindered if blocking I/O is
> performed on a CPU worker thread.
>
> It is somewhat alleviated by using background threads for I/O (in the
> readahead iterator) but this implementation is not complete and does
> not allow for nested parallelism.  I would like to convert Arrow's I/O
> operations to an asynchronous model (expanding on the existing futures
> API).  I have already converted the CSV reader in this fashion [2] as
> a proof of concept.
>
> I have written a more detailed proposal here [1].  Please feel free to
> suggest improvements or alternate approaches.  Also, please let me
> know if I missed any goals or considerations I should keep in mind.
>
> Also, hello, this email is a bit of an introduction.  I have
> previously made one or two small comments/changes but I am hoping to
> be more involved going forwards.  I've mostly worked on proprietary
> test and measurement software but have recently joined Ursa Computing
> which will allow me more time to work on Arrow.
>
> Thanks,
>
> Weston Pace
>
> [1] 
> https://docs.google.com/document/d/1tO2WwYL-G2cB_MCPqYguKjKkRT7mZ8C2Gc9ONvspfgo/edit?usp=sharing
> [2] https://github.com/apache/arrow/pull/9095
> [3] 
> https://mail-archives.apache.org/mod_mbox/arrow-dev/202009.mbox/%3CCAJPUwMDmU3rFt6Upyis%3DyXB%3DECkmrjdncgR9xj%3DDFapJt9FfUg%40mail.gmail.com%3E
