Re: ExecBatch in arrow execution engine

2022-05-09 Thread Yue Ni
Thanks all for the suggestions.
> A possible solution is to derive from ExecBatch your own class
I haven't tried it yet, but that was my initial thought, and I am not sure whether there is a better, more idiomatic solution in the query engine for this.
> Does the existing filter "guarantee" mechanism w

Re: ExecBatch in arrow execution engine

2022-05-09 Thread David Li
Also see this related discussion, which petered out: https://issues.apache.org/jira/browse/ARROW-12873
On Mon, May 9, 2022, at 15:40, Weston Pace wrote:
> Any kind of "batch-level" information is a little tricky in the execution engine because nodes are free to chop up and recombine batches a

Re: ExecBatch in arrow execution engine

2022-05-09 Thread Weston Pace
Any kind of "batch-level" information is a little tricky in the execution engine because nodes are free to chop up and recombine batches as they see fit. For example, the output of a join node is going to contain data from at least two different input batches. Even nodes with a single input and s
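The point about chopping up and recombining batches can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: `Batch`, `split`, `concat`, and the `tag` field are hypothetical stand-ins, not Arrow APIs, and the real engine works on C++ `ExecBatch` objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Batch:
    """Hypothetical stand-in for an execution-engine batch (not Arrow's ExecBatch)."""
    rows: List[int]
    tag: Optional[str] = None  # illustrative batch-level information


def split(batch: Batch, size: int) -> List[Batch]:
    """Chop one batch into smaller ones; copying the tag is still unambiguous."""
    return [Batch(batch.rows[i:i + size], batch.tag)
            for i in range(0, len(batch.rows), size)]


def concat(batches: List[Batch]) -> Batch:
    """Recombine batches; the tag survives only if every input agrees on it."""
    rows = [r for b in batches for r in b.rows]
    tags = {b.tag for b in batches}
    return Batch(rows, tags.pop() if len(tags) == 1 else None)


left = Batch([1, 2, 3, 4], tag="file-a")
right = Batch([5, 6], tag="file-b")

pieces = split(left, 3)              # two pieces, both still tagged "file-a"
merged = concat([pieces[1], right])  # mixes rows from "file-a" and "file-b"
```

In this sketch, splitting keeps the tag meaningful, but the recombined batch has no single tag that describes all of its rows, so the stand-in drops it. Dropping is only one possible policy; it is chosen here to make the ambiguity visible.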

Re: ExecBatch in arrow execution engine

2022-05-09 Thread Yaron Gvili
Hi Yue,
From my limited experience with the execution engine, my understanding is that the API allows streaming only an ExecBatch from one node to another. A possible solution is to derive from ExecBatch your own class (say) RichExecBatch that carries any extra metadata you want. If in your
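As a rough sketch of the subclassing idea, written in plain Python rather than against Arrow itself: `ExecBatchStandIn`, the `metadata` field, and `downstream_node` are hypothetical names introduced for illustration only, and the real `ExecBatch` is a C++ class whose layout and semantics differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ExecBatchStandIn:
    """Hypothetical stand-in for ExecBatch; the real class is C++ and differs."""
    values: List[Any]


@dataclass
class RichExecBatch(ExecBatchStandIn):
    """Derived batch carrying extra metadata, as suggested in the message."""
    metadata: Dict[str, Any] = field(default_factory=dict)


def downstream_node(batch: ExecBatchStandIn) -> Any:
    """A consumer written against the base type that opts in to the metadata."""
    if isinstance(batch, RichExecBatch):
        return batch.metadata.get("source")
    return None


rich = RichExecBatch(values=[1, 2, 3], metadata={"source": "partition-0"})
plain = ExecBatchStandIn(values=[4, 5])
```

Here a node that knows about the derived type can read the extra metadata, while a plain batch passes through the same interface without it. Whether intermediate nodes would preserve the derived type is not shown and would have to be checked against the actual engine.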