alamb commented on PR #16322:
URL: https://github.com/apache/datafusion/pull/16322#issuecomment-2965062758

   > I just checked, and SortExec is also setting up RecordBatchReceiverStream. 
The worst case scenario in terms of elapsed time in the poll_next call is that 
all 10k streams are ready in one cycle. This will trigger 10k cursor 
initialization which does some non-trivial work converting the record batch. 
But the current code is doing exactly the same thing today already so it's no 
worse than the status quo as far as I can tell.
   
   For what it is worth, trying to merge 10k streams will be bad for a lot of 
reasons (merge is linear in the number of input streams)
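
To make the "linear in the number of input streams" point concrete, here is a minimal sketch assuming a naive merge that inspects the head of every input for each output row. `naive_merge` is a hypothetical helper written for illustration and is not DataFusion code; the actual merge implementation may use a different data structure with different constants.

```rust
/// Naive k-way merge over already-sorted inputs (hypothetical, not DataFusion code).
/// Returns the merged rows and the number of input heads inspected.
fn naive_merge(inputs: &[Vec<i32>]) -> (Vec<i32>, usize) {
    let mut positions = vec![0usize; inputs.len()];
    let mut merged = Vec::new();
    let mut comparisons = 0usize;
    loop {
        // Scan the current head of every input to find the smallest value.
        let mut best: Option<(usize, i32)> = None;
        for (i, input) in inputs.iter().enumerate() {
            if let Some(&v) = input.get(positions[i]) {
                comparisons += 1;
                match best {
                    // Keep the earlier input on ties.
                    Some((_, b)) if b <= v => {}
                    _ => best = Some((i, v)),
                }
            }
        }
        match best {
            Some((i, v)) => {
                merged.push(v);
                positions[i] += 1;
            }
            // Every input is exhausted.
            None => break,
        }
    }
    (merged, comparisons)
}

fn main() {
    for k in [10usize, 100, 1000] {
        // Build k sorted inputs of 4 rows each with interleaved values.
        let inputs: Vec<Vec<i32>> = (0..k)
            .map(|s| (0..4).map(|r| (r * k + s) as i32).collect())
            .collect();
        let (merged, comparisons) = naive_merge(&inputs);
        println!(
            "k={} rows={} heads inspected per row ~{}",
            k,
            merged.len(),
            comparisons / merged.len()
        );
    }
}
```

Under this assumption the work per output row grows with the number of inputs, which is why a very wide merge is expensive regardless of how the inputs are polled.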
   
   From my perspective, this PR has concrete measurements showing it is faster, 
along with (mostly theoretical) reasoning that it is no worse.
   
   Therefore I think it is good to go and will merge it. We can adjust / change 
/ further optimize as we move on and get additional information.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

