Hi Arrow Community,

With the release of Acero, we were wondering whether it can be used in a
distributed environment; for now, it looks like Acero is intended only for
a local context. For example, suppose we have a query plan with a hash join
node at the root and multiple filter and project nodes on each side of the
tree, each side having its own data source. How can we distribute this
query plan across 3 nodes: 2 nodes that hold the data sources and execute
the filter and project parts of the plan in parallel, and 1 node that
serves as the compute node, performing only the join on the results from
the other 2 nodes?

As I understand it, we would need some form of RPC mechanism between the
ExecNodes of an ExecPlan, and it would probably be integrated with the
Flight framework. Is that the right way to think about it? Is this
something the Arrow community would be interested in, if it is not already
planned? Thanks.

Jayjeet Chakraborty


-- 
*Jayjeet Chakraborty*
CS PhD student
UC Santa Cruz
California, USA
