pepijnve commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2961873505
At the risk of making myself unpopular, I feel it's relevant to share my findings with you. Working on #16322 led me into the Tokio implementation, and in particular to this line in the [Chan implementation](https://github.com/tokio-rs/tokio/blob/master/tokio/src/sync/mpsc/chan.rs#L295). This is the code that handles RecordBatch passing in RecordBatchReceiverStream. I was immediately reminded of the cancellation discussions.

Without realizing it, DataFusion is already using Tokio's coop mechanism. This strengthens my belief that the PR that was merged is going about things the wrong way. It introduces an API that overlaps 100% with something that already exists and is already being used. I don't think it's a good idea to have multiple mechanisms for the same thing.

Pipeline-blocking operators exactly match the pattern described in [the Tokio cooperative scheduling documentation](https://docs.rs/tokio/latest/tokio/task/coop/index.html#cooperative-scheduling), so why not use the solution provided by the runtime, which DataFusion already relies on in quite a few places anyway?
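For readers unfamiliar with the idea, here is a minimal std-only sketch of how a per-poll budget forces a pipeline-blocking loop to yield. It is a toy model, not Tokio's code and not DataFusion's: `poll_proceed`, `AlwaysReady`, `Drain`, `run_counting_polls`, and the budget value of 128 are all hypothetical names and numbers chosen for illustration. The assumption being modelled is that a leaf resource (such as a channel receiver) charges one unit of budget per item and reports `Pending` once the budget is spent, so a consumer that loops over an always-ready input still returns control to the executor.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Illustrative per-poll budget; the value is an assumption for this sketch.
const BUDGET_PER_POLL: u32 = 128;

thread_local! {
    // Remaining budget for the task currently being polled on this thread.
    static BUDGET: Cell<u32> = Cell::new(0);
}

/// Hypothetical budget check: consumes one unit, or asks to be polled
/// again later when the budget is exhausted.
fn poll_proceed(cx: &mut Context<'_>) -> Poll<()> {
    BUDGET.with(|b| {
        let left = b.get();
        if left == 0 {
            // Request an immediate reschedule, then yield.
            cx.waker().wake_by_ref();
            Poll::Pending
        } else {
            b.set(left - 1);
            Poll::Ready(())
        }
    })
}

/// An input that always has an item available, like a receiver whose
/// queue never runs dry. Only the budget check makes it return Pending.
struct AlwaysReady {
    produced: u64,
}

impl AlwaysReady {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<u64> {
        if poll_proceed(cx).is_pending() {
            return Poll::Pending;
        }
        self.produced += 1;
        Poll::Ready(self.produced)
    }
}

/// A pipeline-blocking consumer: loops over its input until `limit`
/// items have been seen and emits nothing before that.
struct Drain {
    input: AlwaysReady,
    limit: u64,
}

impl Future for Drain {
    type Output = u64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        loop {
            match self.input.poll_next(cx) {
                Poll::Ready(n) if n >= self.limit => return Poll::Ready(n),
                Poll::Ready(_) => continue,
                // The input ran out of budget, so the loop yields here.
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny executor: resets the budget before every poll and counts how
/// many polls the future needed to complete.
fn run_counting_polls(limit: u64) -> (u64, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(Drain {
        input: AlwaysReady { produced: 0 },
        limit,
    });
    let mut polls = 0;
    loop {
        BUDGET.with(|b| b.set(BUDGET_PER_POLL));
        polls += 1;
        if let Poll::Ready(n) = fut.as_mut().poll(&mut cx) {
            return (n, polls);
        }
    }
}

fn main() {
    let (items, polls) = run_counting_polls(1000);
    println!("{} items in {} polls", items, polls);
}
```

In this model the consumer never checks a budget itself. The yield point comes entirely from the input, which is the property the comment attributes to channel-backed streams. Between polls a real executor would be free to run other tasks or observe cancellation.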