Sorry, the generator link in my earlier message is wrong. We traced into the code and found that it uses BackgroundGenerator: https://github.com/apache/arrow/blob/78fb2edd30b602bd54702896fa78d36ec6fefc8c/cpp/src/arrow/util/async_generator.h#L1581

On Mon, Jul 25, 2022 at 11:07 AM Li Jin <ice.xell...@gmail.com> wrote:

> Hi,
>
> Ivan and I are debugging some behavior of the source node this morning, and
> I was hoping to confirm that our understanding is correct.
>
> We observed that when using a source node with a generator:
>
> https://github.com/apache/arrow/blob/66c66d040bbf81a4819b276aee306625dc02837c/cpp/src/arrow/compute/exec/options.h#L54
>
> the source node becomes "sequential" (batches come out in order, one at a
> time) even with GetCpuThreadPool() attached to the plan.
>
> We traced the code into this class:
>
> https://github.com/apache/arrow/blob/78fb2edd30b602bd54702896fa78d36ec6fefc8c/cpp/src/arrow/util/async_generator.h#L316
>
> It seems that, because of the synchronization in this class, it
> generates batches sequentially. Is this understanding correct, and is it
> intentional that the source node is sequential when backed by a
> generator? (This is actually the behavior that we want.)
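
To make the question concrete, here is a toy model of the behavior we think we are seeing. It is only a sketch and does not use Arrow or reproduce BackgroundGenerator; the names Generator, MakeCountingGenerator and DrainWithWorkers are made up for illustration. It assumes that pulling an item and emitting it both happen under a single lock, which is our guess at the effect of the synchronization, not a description of the actual implementation.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical pull-style generator: returns the next item, or nullopt at the end.
using Generator = std::function<std::optional<int>()>;

// Toy source yielding 0, 1, ..., n-1. It is not thread-safe on its own.
Generator MakeCountingGenerator(int n) {
  return [i = 0, n]() mutable -> std::optional<int> {
    if (i >= n) return std::nullopt;
    return i++;
  };
}

// Several worker threads drain one generator. Because each pull and the
// matching emit share one lock, items come out in order, one at a time,
// regardless of how many workers are running.
std::vector<int> DrainWithWorkers(Generator gen, int num_workers) {
  std::mutex mu;
  std::vector<int> emitted;
  auto worker = [&]() {
    for (;;) {
      std::lock_guard<std::mutex> lock(mu);  // serializes pull + emit
      std::optional<int> item = gen();
      if (!item) return;                     // generator exhausted
      emitted.push_back(*item);
    }
  };
  std::vector<std::thread> threads;
  for (int t = 0; t < num_workers; ++t) threads.emplace_back(worker);
  for (auto& th : threads) th.join();
  return emitted;
}
```

Calling DrainWithWorkers(MakeCountingGenerator(100), 4) returns the items in generator order even though four threads take part, which matches the "sequential" output we observe from the source node.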