Useful read on this topic today from the Julia language blog: https://julialang.org/blog/2019/07/multithreading
On Tue, Jul 23, 2019, 12:22 AM Jacques Nadeau <jacq...@apache.org> wrote:

> There are two main things that have been important to us in Dremio
> around threading:
>
> Separate the threading model from the algorithms. We chose to do
> parallelization at the engine level instead of the operation level.
> This allows us to substantially increase parallelization while still
> maintaining a strong thread prioritization model. This contrasts with
> systems like Apache Impala, which chose to implement threading at the
> operation level; that choice has ultimately hurt their ability to
> scale individual workloads out within a node. See the experimental
> MT_DOP feature, where they tried to retreat from this model and
> struggled to do so; it serves as an example of the challenges you
> face if you don't separate data algorithms from threading early in
> the design [1]. This intention was core to how we designed Gandiva,
> where an external driver makes decisions around threading and the
> actual algorithm only does small amounts of work before yielding to
> the driver. This allows the driver to make parallelization and
> scheduling decisions without having to know the internals of the
> algorithm. (In Dremio, these are all covered by the interfaces
> described in Operator [2] and its subclasses, which together expose a
> very simple set of operation states for the driver to understand.)
>
> The second is that the majority of the data we work with these days
> lives in high-latency cloud storage. While we may stage data locally,
> a huge share of our reads are bounded by the performance of cloud
> stores. To cover these performance behaviors we did two things:
> first, we introduced a very simple-to-use async reading interface for
> data [3]; second, we introduced a collaborative way for individual
> tasks to declare their blocking state to a central coordinator [4].
> Happy to cover these in more detail if people are interested. In
> general, these techniques have allowed us to tune many systems to the
> point where the (highly) variable latency of cloud stores like S3 and
> ADLS can be mostly cloaked by aggressive read-ahead and what we call
> predictive pipelining (where reading is guided by observed latency
> characteristics along with knowledge of columnar formats like
> Parquet).
>
> [1]
> https://www.cloudera.com/documentation/enterprise/latest/topics/impala_mt_dop.html#mt_dop
> [2]
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/spi/Operator.java
> [3]
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/exec/store/dfs/async/AsyncByteReader.java
> [4]
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/threads/sharedres/SharedResourceManager.java
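For illustration, here is a minimal C++ sketch of the driver/operator
separation Jacques describes. Everything in it is hypothetical (Dremio's
actual Operator interface [2] is Java); the point is only the inversion
of control: an operator never blocks or spawns threads, it just reports
a state and does a small, bounded unit of work when asked, so an
external driver owns every parallelization and scheduling decision.

    // Hypothetical sketch, not Dremio's API. Operators expose a tiny
    // state machine; the driver decides who runs, where, and when.
    #include <memory>
    #include <vector>

    enum class OpState { RUNNABLE, BLOCKED_ON_IO, DONE };

    class Operator {
     public:
      virtual ~Operator() = default;
      virtual OpState State() const = 0;
      // Do a small, bounded amount of work, then return to the driver.
      virtual void DoWork() = 0;
    };

    // Single-threaded driver loop for clarity. A real driver could hand
    // RUNNABLE operators to a core-sized thread pool, prioritize one
    // pipeline over another, and park BLOCKED_ON_IO operators until an
    // I/O completion wakes them -- all without touching the algorithms.
    void Drive(std::vector<std::unique_ptr<Operator>>& ops) {
      bool progressed = true;
      while (progressed) {
        progressed = false;
        for (auto& op : ops) {
          if (op->State() == OpState::RUNNABLE) {
            op->DoWork();
            progressed = true;
          }
        }
      }
    }

Because the loop belongs to the driver, threading policy can change
freely (per-core pools, priorities, fairness) without any operator
knowing or caring.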
> On Mon, Jul 22, 2019 at 9:56 AM Antoine Pitrou <anto...@python.org> wrote:
>
> > On 22/07/2019 18:52, Wes McKinney wrote:
> > >
> > > Probably the way is to introduce async-capable read APIs into the
> > > file interfaces. For example:
> > >
> > > file->ReadAsyncBlock(thread_ctx, ...);
> > >
> > > That way the file implementation can decide whether asynchronous
> > > logic is actually needed.
> > >
> > > I doubt very much that a one-size-fits-all concurrency solution
> > > can be developed -- in some applications coarse-grained IO and CPU
> > > task scheduling may be warranted, but we need to have a solution
> > > for finer-grained scenarios where:
> > >
> > > * in the memory-mapped case, there is no overhead, and
> > > * the programming model is not too burdensome to the library
> > >   developer.
> >
> > Well, the asynchronous I/O programming model *will* be burdensome at
> > least until C++ gets coroutines (which may happen in C++20, and
> > therefore be usable somewhere around 2024 for Arrow?).
> >
> > Regards
> >
> > Antoine.
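To make the ReadAsyncBlock idea above concrete, here is a minimal C++
sketch. The types and signatures are hypothetical (this is not Arrow's
actual API), and std::future merely stands in for whatever completion
mechanism a real design would choose. It shows the property Wes asks
for: each file implementation decides for itself whether asynchrony is
needed, so the memory-mapped case completes synchronously with
essentially no overhead, while the cloud-store case can keep several
reads in flight for read-ahead.

    // Hypothetical sketch, not Arrow's API.
    #include <cstdint>
    #include <future>
    #include <vector>

    struct Block {
      std::vector<uint8_t> data;
    };

    class RandomAccessFile {
     public:
      virtual ~RandomAccessFile() = default;
      virtual std::future<Block> ReadAsyncBlock(int64_t offset,
                                                int64_t length) = 0;
    };

    // Memory-mapped case: satisfy the future immediately; the caller
    // pays nothing beyond the future machinery itself.
    class MemoryMappedFile : public RandomAccessFile {
     public:
      explicit MemoryMappedFile(const uint8_t* base) : base_(base) {}
      std::future<Block> ReadAsyncBlock(int64_t offset,
                                        int64_t length) override {
        Block b;
        b.data.assign(base_ + offset, base_ + offset + length);
        std::promise<Block> ready;
        ready.set_value(std::move(b));
        return ready.get_future();
      }
     private:
      const uint8_t* base_;
    };

    // High-latency case: run the read elsewhere (std::async stands in
    // for an I/O pool), so several blocks can be in flight at once.
    class CloudStoreFile : public RandomAccessFile {
     public:
      std::future<Block> ReadAsyncBlock(int64_t offset,
                                        int64_t length) override {
        return std::async(std::launch::async, [offset, length] {
          Block b;
          b.data.resize(static_cast<size_t>(length));
          // A real implementation would issue a ranged GET for
          // [offset, offset + length) against S3/ADLS here.
          (void)offset;
          return b;
        });
      }
    };

Note that this also illustrates Antoine's concern: without coroutines,
composing such futures means blocking on get() or chaining callbacks,
which is exactly the programming-model burden he describes.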