Hello!

I have some questions about how "pyarrow.substrait.run_query" works.

Currently run_query returns a record batch reader. Since Acero uses a
push-based model and the reader is pull-based, I'd assume the reader object
somehow accumulates the batches that are pushed to it. So I wonder:

(1) Do the output batches keep accumulating in the reader object until
someone reads from the reader?
(2) Are there any back pressure mechanisms implemented to prevent OOM if
data doesn't get pulled from the reader? (Bounded cache in the reader
object?)
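
To make (2) concrete, here is a rough sketch of the kind of bounded cache I
have in mind. It is plain Python and is not meant to describe how Acero or
pyarrow actually implement the reader; BoundedReader, push, close, produce,
and demo are hypothetical names made up for illustration. The producer
thread pushes batches and blocks once max_buffered_batches are waiting, so
memory stays bounded when the consumer is slow.

```python
import queue
import threading


class BoundedReader:
    """Hypothetical pull-based reader fed by a push-based producer."""

    def __init__(self, max_buffered_batches):
        # A bounded queue: put() blocks when it is full (back pressure).
        self._queue = queue.Queue(maxsize=max_buffered_batches)
        self.max_seen = 0  # largest queue size observed after a push

    def push(self, batch):
        # Called by the producer; blocks while the buffer is full.
        self._queue.put(batch)
        self.max_seen = max(self.max_seen, self._queue.qsize())

    def close(self):
        # None is used as an end-of-stream sentinel.
        self._queue.put(None)

    def __iter__(self):
        # Called by the consumer; pulls one batch at a time.
        while True:
            batch = self._queue.get()
            if batch is None:
                return
            yield batch


def produce(reader, num_batches):
    # Stand-in for the push-based engine emitting output batches.
    for i in range(num_batches):
        reader.push(f"batch-{i}")
    reader.close()


def demo(num_batches=10, max_buffered_batches=2):
    reader = BoundedReader(max_buffered_batches)
    producer = threading.Thread(target=produce, args=(reader, num_batches))
    producer.start()
    batches = list(reader)  # pull side drains the reader
    producer.join()
    return batches, reader.max_seen
```

With an unbounded queue (maxsize=0) the same code would describe case (1),
where batches simply pile up until someone reads them.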

Thanks,
Li
