Hello,
Arrow C++ comes with execution facilities (such as thread pools, async
generators...) meant to unlock higher performance by hiding IO latencies
and exploiting several CPU cores. These execution facilities also
obscure the context in which a task is executed: you cannot simply use
local, global or thread-local variables to store ancillary parameters.
Over the years we have started adding optional metadata that can be
associated with tasks:
- StopToken
- TaskHints (though that doesn't seem to be used currently?)
- IO tags, which some people have started to ask about:
https://github.com/apache/arrow/issues/37267
However, any such additional metadata must currently be explicitly
passed to all tasks that might make use of it.
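To make the explicit style concrete, here is a minimal sketch in Python
(standard library only), not Arrow C++. threading.Event stands in for a
StopToken, and read_chunk / read_all are hypothetical names, not Arrow
APIs. The point is only that the intermediate layer has to accept and
forward a parameter it never uses itself.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def read_chunk(index, stop_event):
    # Hypothetical leaf task: it can observe cancellation only because
    # the caller handed it stop_event explicitly.
    if stop_event.is_set():
        return None
    return index * 2


def read_all(executor, count, stop_event):
    # Intermediate layer: it does not use stop_event itself, but must
    # still take it as a parameter and forward it to every leaf task.
    futures = [executor.submit(read_chunk, i, stop_event) for i in range(count)]
    return [f.result() for f in futures]


with ThreadPoolExecutor(max_workers=2) as executor:
    stop_event = threading.Event()
    results = read_all(executor, 3, stop_event)
    # Once the event is set, every leaf task sees it and bails out.
    stop_event.set()
    cancelled = read_all(executor, 3, stop_event)
```

Every new kind of metadata (hints, IO tags...) would add another
parameter to each function along this chain.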
My questions are thus:
- do we want to continue using the explicit passing style?
- or, conversely, do we want to switch to a paradigm where such metadata,
once set, is propagated implicitly along the task dependency flow (e.g.
from the caller of Executor::Submit to the task submitted)?
- are there useful or insightful precedents in the C++ ecosystem?
(note: a similar facility in Python is brought by "context vars":
https://docs.python.org/3/library/contextvars.html)
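For reference, here is a small sketch of how the Python facility behaves,
using only the standard library. The io_tag variable and the function
names are hypothetical and merely echo the IO tags request. This
illustrates the Python precedent, not a proposed Arrow design: asyncio
copies the current context into each new task, while for a thread pool
the sketch snapshots the context explicitly at submission time.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical ancillary parameter, loosely modelled on the IO tags idea.
io_tag = contextvars.ContextVar("io_tag", default="untagged")


def leaf_task():
    # Reads the tag without receiving it as an argument.
    return io_tag.get()


async def async_leaf():
    return io_tag.get()


async def via_asyncio():
    io_tag.set("scan-1")
    # asyncio.create_task() copies the current context into the new task,
    # so the value set above is visible inside async_leaf().
    return await asyncio.create_task(async_leaf())


def via_thread_pool():
    io_tag.set("scan-2")
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Snapshot the submitter's context and run the task inside it,
        # rather than relying on what the worker thread starts with.
        ctx = contextvars.copy_context()
        return pool.submit(ctx.run, leaf_task).result()


async_result = asyncio.run(via_asyncio())
# The coroutine ran in its own copy of the context, so the value it set
# did not leak back into the caller.
outer_after_async = io_tag.get()
wrapped_result = via_thread_pool()
```

An implicit scheme for Arrow would presumably need an equivalent of the
copy_context() step inside Executor::Submit, so that the snapshot is
taken where the task is submitted rather than where it runs.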
Regards
Antoine.