Apologies for the last mail and noise, it was ment to be a direct reply.

On Fri, 10 Apr, 2026, 8:10 am Ganesh Sivakumar, <[email protected]>
wrote:

> Hey Joey, This is Ganesh. I am building a new portable beam runner in
> rust. I guess we are working in same territory, as of now the runner can
> accept pipeline from client via job service and run greedy fusion and
> create execution graph. I am currently working on control and data channel
> orchestration with runner and worker(SDK harness.  Lot of work to do.
>
> With your question, based on my understanding your trying send multiple
> batches of Elements to worker once and trying to check if worker is ready
> to accept it.
>
> I am not at your phase yet so mine may not be the best approach. I guess
> you are trying to avoid the setup cost for every batch. One way is to pool
> elements and make the batch size larger but the runner might spend some
> idle time in collecting the elements and would introduce some latency. 2nd
> is to spin up multiple workers and send each batch to each worker so that
> they run the batch in parallel and save some time.
>
> Thanks,
> Ganesh.
>
> On Fri, 10 Apr, 2026, 3:04 am Joey Tran, <[email protected]> wrote:
>
>> Hey all,
>>
>> I'm working on extending my pipeline runner so that multiple batches of
>> inputs are streamed into a single bundle (to amortize bundle setup time)
>> rather than just sending a single batch of inputs into the data channel and
>> then closing the data channel.
>>
>> I'm using the ProcessBundleProgressRequest[1] to poll my workers to see
>> if they're idle and ready for more inputs using the
>> `consuming_received_data` field, but I noticed that
>> `consuming_received_data` doesn't actually imply whether or not the worker
>> is busy or ready for work. Specifically, while the worker is setting up
>> (e.g. running `setup_bundle`), `consuming_received_data` is set to False
>> even though the worker is not actually ready for work.
>>
>> This makes it a bit awkward for runners to know whether or not it's safe
>> to send work to a worker. If `setup_bundle` takes a very long time, then a
>> runner might repeatedly send more and more work if it uses
>> `consuming_received_data` as an indicator of idleness. Am I misusing
>> `consuming_received_data`? Is there another more reliable way to detect
>> worker idleness?
>>
>> cheers,
>> Joey
>>
>>
>> https://github.com/apache/beam/blob/357b7206f82f788b55732247c9096527f92adf4f/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L476
>>
>

Reply via email to