pepijnve commented on issue #16490: URL: https://github.com/apache/datafusion/issues/16490#issuecomment-2992814811
The guide behind the developer login wall has much more detailed information, if you're interested: https://developer.apple.com/download/apple-silicon-cpu-optimization-guide/

This reminds me a bit of the morsel paper. I wonder if you could achieve similar results with a repartition strategy that takes into account queue length, or some other metric that approximates load per thread. In other words, try to send data down the lane that's processing the fastest. You might risk getting a rather uneven data distribution between the partitions that way, though.

How do the push engines deal with that? If I understood the paper correctly, they also keep data in separate per-thread storage before merging it into something global.
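To make the queue-length idea a bit more concrete, here is a rough sketch of what I have in mind. It is not based on DataFusion's actual repartition code: `Batch`, `LoadAwareRouter`, `route` and `poll` are all hypothetical names, the queues are plain in-memory `VecDeque`s rather than channels, and "pending rows per queue" is just one possible stand-in for load per thread.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a record batch; only the row count matters here.
struct Batch {
    rows: usize,
}

/// Hypothetical load-aware router with one queue per output partition.
struct LoadAwareRouter {
    queues: Vec<VecDeque<Batch>>,
}

impl LoadAwareRouter {
    fn new(partitions: usize) -> Self {
        assert!(partitions > 0, "need at least one output partition");
        Self {
            queues: (0..partitions).map(|_| VecDeque::new()).collect(),
        }
    }

    /// Rows still waiting in a partition's queue (the assumed load metric).
    fn pending_rows(&self, partition: usize) -> usize {
        self.queues[partition].iter().map(|b| b.rows).sum()
    }

    /// Send the batch to the partition with the fewest pending rows.
    /// Ties go to the lowest partition index. Returns the chosen partition.
    fn route(&mut self, batch: Batch) -> usize {
        let target = (0..self.queues.len())
            .min_by_key(|&p| self.pending_rows(p))
            .expect("at least one partition");
        self.queues[target].push_back(batch);
        target
    }

    /// Consumer side: take the next batch for a partition, if any.
    fn poll(&mut self, partition: usize) -> Option<Batch> {
        self.queues[partition].pop_front()
    }
}
```

A lane that drains its queue quickly keeps getting picked, while a slow lane receives less data, so the total number of rows seen per partition can end up skewed. That is the uneven distribution I'm worried about.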