> > If a method could potentially run some kind of long-term blocking I/O wait then yes. So reading / writing tables & datasets, IPC, filesystem APIs, etc. will all need to adapt. It doesn't have to be all at once. CPU-only functions would remain as they are. So table manipulation, compute functions, etc. would remain as they are. For example, there would never be any advantage to creating an asynchronous method to drop a column from a table.
My main concern is around the "viralness" of Futures. I think they are good in some cases but can become hard to reason about / error-prone if you aren't used to working with them day in, day out. I don't have any concrete recommendation at this point, just something we should be careful about when doing the refactoring.

On Tue, Feb 16, 2021 at 4:43 AM Antoine Pitrou <anto...@python.org> wrote:
>
> Thanks for this useful writeup and the ensuing discussion. What you're proposing basically looks sound to me.
>
> Regards
>
> Antoine.
>
>
> On 16/02/2021 at 09:29, Weston Pace wrote:
> > Thanks for the input. I appreciate you both taking the time to look through this. I'll consolidate the points here.
> >
> > From Wes:
> >
> >> I hypothesize that the bottom of the stack is a thread pool with a queue-per-thread that implements work stealing.
> >
> > Yes. I think there may be two pools (one for I/O and one for CPU), but for the work I'm intending (unblocking deadlock and getting I/O off the CPU pool) what is there today is sufficient. Moving to work stealing would be a secondary priority. It's possible a query scheduler might have special needs. For example, Quickstep wants a dedicated "master scheduler" thread which cannot be stolen from.
> >
> >> Some code paths might use this low-level task API directly
> >
> > Yes, I have no intention of completely shadowing the lower-level executor, only adding optional utilities useful for building pipelines.
> >
> >> I've brought this up in the past, but if we are comfortable with more threads than CPU cores...let me know if it doesn't make sense.
> >
> > It makes sense, and other projects (e.g. C#, Postgres) use this kind of "more threads than CPUs" model. It can be simpler because it doesn't require you to split a long-running job with blocking waits into smaller jobs, but it won't offer more capability/functionality.
> > Both the Quickstep model and the asynchronous generators model have mechanisms in place so that "follow-up tasks" which are ready to run will run immediately without going to the back of the queue. This is done by running a callback immediately instead of submitting it to the thread pool.
> >
> > The downsides of the "more threads" approach are:
> > * It does not support the single-threaded case well.
> > * It puts scheduling in the hands of the OS. It can actually prevent the benefit you described: if you create a new thread / task pool to address some "run right away" tasks, then there is no guarantee the new thread will be given a core by the OS.
> > * It often leads to more communication between threads within a task, which means more potential for race conditions (this last point is a little subjective but matches my personal experience).
> >
> >> If there is a mechanism for async task producers to coexist alongside code that manually manages the execution order of tasks generated by its task graph (thinking of query engine code here a la Quickstep)
> >
> > Yes. Any of these "small tasks" models work together pretty well. I've taken a decent look at Quickstep and I am confident it will be compatible. Many of these tasks will serve as input jobs for any kind of query engine, so it will need to be.
> >
> > From Micah:
> >
> >> One thing that I don't quite understand with this proposal is the scope?
> >
> > My primary scope is to get blocking I/O waits off of the CPU thread pool (nested parallelism and potentially improved pipelining are bonus benefits).
> >
> >> Is the intention that most APIs will eventually work with Futures instead of raw return values
> >
> > If a method could potentially run some kind of long-term blocking I/O wait then yes. So reading / writing tables & datasets, IPC, filesystem APIs, etc. will all need to adapt. It doesn't have to be all at once.
> > CPU-only functions would remain as they are. So table manipulation, compute functions, etc. would remain as they are. For example, there would never be any advantage to creating an asynchronous method to drop a column from a table.
> >
> > Publicly, things can remain as they are to achieve my primary goal. Once that is complete, it could be a feature enhancement to allow an asynchronous alternative. It's useful for those that are already writing asynchronous code. In that scenario I'd propose following the .NET model and having duplicate APIs (e.g. Read and ReadAsync).
> >
> > Internally, duplicate APIs will probably be needed as things migrate, but once more pieces move to an asynchronous model the synchronous API could go away. It's hard to put any kind of number on this, but if you look at the CSV reader then probably 10% or less of the code path (in units of "lines of code") is in methods returning futures. Probably much smaller once you consider all the code behind the parsing / inferring / type handling / array building. If you think of your code as a tree structure where the nodes are methods and the edges are calls, the asynchronous path is a fairly straight line from the topmost API call to some I/O handoff lower down the tree. All of the code in the "branches" on either side of this path is synchronous.
> >
> > On Mon, Feb 15, 2021 at 6:49 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >>
> >> I took a pass through this; thank you for a good discussion of the alternatives. One thing that I don't quite understand with this proposal is the scope. Is the intention that most APIs will eventually work with Futures instead of raw return values (i.e. returning a Table or RecordBatch will never be a thing, but instead you get references to Future<Table>)?
> >>
> >> Thanks,
> >> Micah
> >>
> >> On Mon, Feb 15, 2021 at 2:15 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >>> hi Weston,
> >>>
> >>> Thanks for putting this comprehensive and informative document together.
> >>>
> >>> There are several layers of problems to consider, just thinking out loud:
> >>>
> >>> * I hypothesize that the bottom of the stack is a thread pool with a queue-per-thread that implements work stealing. Some code paths might use this low-level task API directly, for example a workload putting all of its tasks into one particular queue and letting the other threads take work if they are idle.
> >>>
> >>> * I've brought this up in the past, but if we are comfortable with more threads than CPU cores, we may allow for the base-level thread pool to be expanded dynamically. The tradeoff here is coarse-granularity context switching between tasks only at time of task completion vs. the OS context-switching mid-task between threads. For example, if there is a code path which wishes to guarantee that a thread is put to work right away to execute its tasks, even if all of the other queues are full of other tasks, then this could partially address the task prioritization problem discussed in the document. If there is a notion of a "task producer" or a "workload" and the number of task producers exceeds the size of the thread pool, then an additional thread + dedicated task queue for that thread could be created to handle tasks submitted by the producer. Maybe this is a bad idea (I'm not an expert in this domain after all), let me know if it doesn't make sense.
> >>>
> >>> * I agree that we should encourage as much code as possible to use the asynchronous model. Per above, if there is a mechanism for async task producers to coexist alongside code that manually manages the execution order of tasks generated by its task graph (thinking of query engine code here a la Quickstep), then that might be good.
> >>>
> >>> Lots to do here, but excited to see things evolve and to see the project grow faster and more scalable on systems with a lot of cores that do a lot of mixed IO/CPU work!
> >>>
> >>> - Wes
> >>>
> >>> On Tue, Feb 2, 2021 at 9:02 PM Weston Pace <weston.p...@gmail.com> wrote:
> >>>>
> >>>> This is a follow-up to a discussion from last September [3]. I've been investigating Arrow's use of threading and I/O and I believe there are some improvements that could be made. Arrow currently supports two threading options (a single thread and a "per-core" thread pool). Both of these approaches are hindered if blocking I/O is performed on a CPU worker thread.
> >>>>
> >>>> It is somewhat alleviated by using background threads for I/O (in the readahead iterator), but this implementation is not complete and does not allow for nested parallelism. I would like to convert Arrow's I/O operations to an asynchronous model (expanding on the existing futures API). I have already converted the CSV reader in this fashion [2] as a proof of concept.
> >>>>
> >>>> I have written a more detailed proposal here [1]. Please feel free to suggest improvements or alternate approaches. Also, please let me know if I missed any goals or considerations I should keep in mind.
> >>>>
> >>>> Also, hello, this email is a bit of an introduction. I have previously made one or two small comments/changes but I am hoping to be more involved going forwards.
> >>>> I've mostly worked on proprietary test and measurement software but have recently joined Ursa Computing, which will allow me more time to work on Arrow.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Weston Pace
> >>>>
> >>>> [1] https://docs.google.com/document/d/1tO2WwYL-G2cB_MCPqYguKjKkRT7mZ8C2Gc9ONvspfgo/edit?usp=sharing
> >>>> [2] https://github.com/apache/arrow/pull/9095
> >>>> [3] https://mail-archives.apache.org/mod_mbox/arrow-dev/202009.mbox/%3CCAJPUwMDmU3rFt6Upyis%3DyXB%3DECkmrjdncgR9xj%3DDFapJt9FfUg%40mail.gmail.com%3E