Thanks. I think TaskGroup almost suits my needs; I might need some extra layer around it.
Here is my use case. When converting a record batch to R data structures, all R allocation has to happen on the main thread, but filling the vectors can (for some of them) be done in a task that runs on a different thread. Not all of them: filling R character vectors, for example, needs the main thread. So I was thinking of doing something like this pseudo code:

    auto n = num_columns();
    auto serial = TaskGroup::MakeSerial();
    auto threaded = TaskGroup::MakeThreaded(...);
    for (int i = 0; i < n; i++) {
      // allocate column i on the main thread
      if (<can run in parallel>) {
        threaded.AddTask(...);
      } else {
        serial.AddTask(...);
      }
    }
    // start threaded tasks
    // start serial tasks
    // combine

I guess that just means I need some way to hold the tasks before they go into the task groups.

> On 3 Jan 2019, at 14:36, Antoine Pitrou <anto...@python.org> wrote:
> 
> Hi Romain,
> 
> No, it's better if you use the CPU thread pool directly (or through
> TaskGroup, if that suits your execution model better).
> 
> Regards
> 
> Antoine.
> 
> On 03/01/2019 at 14:29, Romain Francois wrote:
>> Hello,
>> 
>> Are the functions in parallel.h the de facto model for parallelisation in
>> arrow?
>> https://github.com/apache/arrow/blob/42cf69abfc1368c9884f4581811e2e7900c98fcd/cpp/src/arrow/util/parallel.h
>> 
>> Just wondering if things like Intel TBB were considered; IIRC managing
>> threads manually can be expensive and tasks are usually cheaper.
>> 
>> Romain